hexis-mod-it

Model type: bert

Downstream task: binary text classification

Fine-tuning emissions: 0.02776032 kg CO₂eq (carbon-dioxide equivalents)


Benchmarks

  • Accuracy: 0.9116, F1: 0.8439
  • Accuracy: 0.7521, F1: 0.7131
  • Accuracy: 0.9971, F1: 0.9972
  • Self-consistency: Accuracy: 0.9922, F1: 0.9899


Notes
  • Self-consistency test: evaluation on all of the training data.
  • Train-test splits: if a dataset does not come with predefined train and test portions, a 70-30 train-test split is performed.
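
The split and metrics described above can be sketched as follows. This is a minimal illustration assuming scikit-learn; the data and the predictions are placeholders, not the card's actual dataset or model:

```python
# Sketch: 70-30 train-test split plus accuracy/F1 computation,
# as described in the notes above. scikit-learn is assumed;
# `texts`, `labels`, and `y_pred` are illustrative placeholders.
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, f1_score

texts = ["buon prodotto", "pessimo servizio"] * 50   # placeholder data
labels = [1, 0] * 50                                 # binary labels

X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.3, random_state=42, stratify=labels
)

# ... fine-tune the classifier on (X_train, y_train) ...
# For illustration only, a trivial stand-in produces the predictions:
y_pred = y_test

print("Accuracy:", accuracy_score(y_test, y_pred))
print("F1:", f1_score(y_test, y_pred))
```

With `random_state` fixed, the split is reproducible; `stratify=labels` keeps the class balance identical in both portions.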

Debiasing

Test terms from StereoSet that also occur in the training data, showing the difference in attention entropy before and after optimization of information flow. Unit: entropy bits.
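
As a rough sketch of what "attention entropy in bits" means here (NumPy only; the distributions are synthetic, and the card's actual method for extracting and aggregating the model's attention weights is an assumption not specified above):

```python
import numpy as np

def attention_entropy_bits(attn_row, eps=1e-12):
    """Shannon entropy, in bits, of a single attention distribution
    (one row of an attention matrix, which sums to 1)."""
    p = np.asarray(attn_row, dtype=float)
    p = p / p.sum()                          # renormalize defensively
    return float(-(p * np.log2(p + eps)).sum())

# Synthetic example: a uniform distribution over 4 tokens has
# entropy log2(4) = 2 bits; a peaked distribution has lower entropy.
uniform = np.full(4, 0.25)
peaked = np.array([0.97, 0.01, 0.01, 0.01])

print(attention_entropy_bits(uniform))   # ≈ 2.0 bits
print(attention_entropy_bits(peaked))    # well below 2 bits
```

Each per-term value below would then be the entropy after optimization minus the entropy before, aggregated over that term's occurrences; positive values mean attention became more spread out, negative values mean it became more focused.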

Term                        Δ attention entropy (bits)
sposo (spouse)              +8.74e-05
Islam                       +6.60e-05
Russia                      -3.44e-05
francese (French)           -4.05e-05
cantante (singer)           -5.23e-05
capo (boss)                 -5.85e-05
sua (his/her)               -6.57e-05
Irlanda (Ireland)           -6.60e-05
Germania (Germany)          -7.80e-05
uomo (man)                  -7.91e-05
Singapore                   -8.67e-05
lui (he)                    -1.14e-04
Austria                     -1.16e-04
medico (doctor)             -1.26e-04
figlio (son)                -1.42e-04
ragazzo (boy)               -1.50e-04
ragazza (girl)              -1.51e-04
pilota (pilot)              -1.59e-04
Pakistan                    -1.74e-04
Grecia (Greece)             -1.75e-04
padre (father)              -1.90e-04
Cina (China)                -1.96e-04
Francia (France)            -2.05e-04
Giappone (Japan)            -2.07e-04
donna (woman)               -2.20e-04
Ghana                       -2.23e-04
Indonesia                   -2.25e-04
artista (artist)            -2.28e-04
guardia (guard)             -2.32e-04
giudice (judge)             -2.55e-04
Polonia (Poland)            -2.64e-04
Argentina                   -2.81e-04
Europa (Europe)             -3.09e-04
Taiwan                      -3.29e-04
broker                      -3.37e-04
autore (author)             -3.52e-04
modello (model)             -3.53e-04
lei (she)                   -3.75e-04
architetto (architect)      -3.87e-04
direttore (director)        -4.31e-04
poeta (poet)                -4.40e-04
Kenya                       -4.77e-04
marito (husband)            -4.89e-04
Bolivia                     -5.85e-04
Paraguay                    -6.55e-04


Notes

  • Higher entropy scores indicate an increase in information flow between the layers of the neural network. This also strengthens the interrelation between a term and its context and reduces overfitting.

  • Higher is not automatically better: depending on the base model and the task-specific training data, training-time optimization can have equally valid reasons to reduce entropy scores.