hexis-mod-de

Model type: bert

Downstream task: binary text classification

Finetuning emissions: 0.04696735 kg CO₂-equivalents [CO₂eq]

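Usage

The sketch below shows how a binary text-classification checkpoint of this kind can be loaded with the Hugging Face transformers pipeline. It is a minimal, hedged example: the identifier "hexis-mod-de" is taken from this card's title and may differ from the actual repository path, and the label names depend on the checkpoint's configuration.

```python
# Minimal usage sketch. Assumptions (not from the card): the checkpoint is a
# standard Hugging Face sequence-classification model with two labels; the
# identifier "hexis-mod-de" is taken from the card title and may differ from
# the actual repository path.
from transformers import pipeline

classifier = pipeline("text-classification", model="hexis-mod-de")

result = classifier("Beispieltext für die binäre Klassifikation.")
print(result)  # e.g. [{'label': 'LABEL_1', 'score': 0.98}]; labels depend on the config
```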

Benchmarks

  • Accuracy: 0.9708380520951302, F1: 0.956668068994531
  • Accuracy: 0.9548093915677859, F1: 0.9526329716856312
  • Accuracy: 0.8393378773125608, F1: 0.7299509001636661
  • Self-consistency: Accuracy: 0.9997699739303788, F1: 0.999708850931677

Notes
  • Self-consistency test: evaluation on the complete training data.
  • Train-test splits: if a dataset does not come with predefined train and test portions, a 70/30 train-test split is applied (see the sketch below).
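
A minimal sketch of this evaluation protocol, assuming scikit-learn for the split and the metrics; loading the dataset and running the fine-tuned classifier are left as placeholders.

```python
# Minimal sketch of the 70/30 evaluation protocol described above.
# Assumptions (not from the card): labels are 0/1, scikit-learn handles the
# split and the metrics, and `predict` wraps the fine-tuned classifier.
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, f1_score

def evaluate_split(texts, labels, predict, seed=42):
    # 70/30 split, stratified so that both classes keep their proportions
    x_train, x_test, y_train, y_test = train_test_split(
        texts, labels, test_size=0.3, stratify=labels, random_state=seed
    )
    # x_train/y_train would be used for fine-tuning, which is out of scope here
    y_pred = predict(x_test)  # placeholder for inference with the fine-tuned model
    return {
        "accuracy": accuracy_score(y_test, y_pred),
        "f1": f1_score(y_test, y_pred),  # binary F1, as reported under Benchmarks
    }
```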

Debiasing

Test terms from StereoSet that are contained in the training data, showing the difference in attention entropy before and after Optimization of Information Flow. Unit: entropy bits.

Term              Δ Attention Entropy [bits]
Singapur          0.0012903858294235967
Finnland          0.0010515019273751659
Architekt         0.0008392753362104693
Brasilien         0.0006291125368561264
Taiwan            0.0004480721330470302
Columbia          0.0004223893905190395
Oma               0.0003308946202600293
Trainer           0.0003308946202600293
Verkäufer         0.0003178239387944564
Korea             0.00026210155991490886
Freundin          0.00026003776810455524
Pakistan          0.00024215157241482392
Künstler          0.00021738607069058058
Vater             0.00021429038297505015
Frau              0.00019873550766439312
Freund            0.00019445505057639467
Herr              0.0001922766036646116
Mann              0.0001891809159490812
Autor             0.00018298954051802036
Dame              0.00016510334482828905
Peru              0.00016028783060413062
Deutschland       0.0001578370778293357
Krim              0.00015444042047479538
Tochter           0.00015409645517306978
Nepali            0.00015272059396616736
Griechenland      0.00015203266336271615
seine             0.00015065680215581374
Frankreich        0.00014842102769459733
sie               0.00014830196278187014
Schriftsteller    0.00014377749612130172
Schauspieler      0.0001410257737074969
Österreich        0.0001395352574004464
Iran              0.0001386180165954177
Chef              0.00013552232887988725
Russland          0.00013380250237125923
er                0.00012849819535011413
Portugal          0.0001252033698281192
Professor         0.00012451543922466799
China             0.00011935595969878395
Islam             0.00011863675952279807
Deutsch           0.00011687578883971488
Afrika            0.00011557234137980233
Indien            0.00011098613735594006
Ukraine           0.00010869303534529029
Koch              0.00010146976400905264
Chile             0.00010112579870732703
Mohammed          9.49344232762662e-05
Paraguay          9.355856206936379e-05
Arzt              9.080683965555897e-05
Australien        8.771115194002855e-05
Venezuela         7.77361581899861e-05
Lehrer            6.810512974166923e-05
Mexiko            6.019392780198038e-05
Christ            5.6410309482998755e-05
Journalist        5.159479525884033e-05
Sudan             4.012928520174738e-05
Katar             3.577239137946263e-05
Nepal             3.026894655185299e-05
ihm               -2.126330955422975e-06
Polen             -1.3987922269320242e-05
Europa            -3.474049547428582e-05
Guatemala         -0.0002682929353459697


Notes

  • Higher entropy scores indicate an increase in information flow between the layers of the neural network. This also strengthens the interrelatedness between a term and its surroundings and reduces overfitting.

  • Higher is not automatically better: depending on the base model and the task-specific training data, optimization at training time can have equally valid reasons to reduce entropy scores. A sketch of how such a per-term entropy difference can be computed follows below.
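
For illustration, the following is a minimal sketch of how a per-term attention-entropy difference of this kind could be computed; it is an assumption-laden reconstruction, not the authors' implementation. It assumes two Hugging Face BERT checkpoints (before and after the information-flow optimization), identified here by the placeholder ids before_id and after_id, and it averages the Shannon entropy of the attention distributions at the term's token positions over all layers and heads.

```python
# Hedged sketch: per-term attention-entropy difference between two checkpoints.
# Assumptions (not from the card): both models are Hugging Face BERT checkpoints,
# entropy is averaged over all layers, heads, and the term's token positions,
# and the sentence containing the term comes from the training data.
import torch
from transformers import AutoModel, AutoTokenizer

def mean_attention_entropy(model, tokenizer, sentence, term):
    """Mean Shannon entropy (in bits) of the attention rows at the term's positions."""
    enc = tokenizer(sentence, return_tensors="pt")
    term_ids = tokenizer(term, add_special_tokens=False)["input_ids"]
    # crude position matching by token id; a real implementation would align subwords
    positions = [i for i, t in enumerate(enc["input_ids"][0].tolist()) if t in term_ids]
    if not positions:
        raise ValueError("term not found in sentence")
    with torch.no_grad():
        attentions = model(**enc, output_attentions=True).attentions  # tuple over layers
    entropies = []
    for layer in attentions:                   # each layer: (1, heads, seq, seq)
        probs = layer[0, :, positions, :]      # attention rows for the term's tokens
        ent = -(probs * torch.log2(probs.clamp_min(1e-12))).sum(dim=-1)
        entropies.append(ent.mean().item())    # average over heads and positions
    return sum(entropies) / len(entropies)     # average over layers

def entropy_difference(before_id, after_id, sentence, term):
    # before_id / after_id are hypothetical checkpoint identifiers
    tok = AutoTokenizer.from_pretrained(before_id)
    before = AutoModel.from_pretrained(before_id)
    after = AutoModel.from_pretrained(after_id)
    return (mean_attention_entropy(after, tok, sentence, term)
            - mean_attention_entropy(before, tok, sentence, term))
```

In practice such a score would be aggregated over all training sentences that contain the term; the single-sentence function above only illustrates the entropy calculation itself.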