hexis-mod-es

Model type: bert

Downstream task: binary text classification

Fine-tuning emissions: 0.02784138 kg CO₂-equivalents (CO₂eq)


Benchmarks

  • Accuracy: 0.915952380952381, F1: 0.8361948955916474
  • Accuracy: 0.953437613677701, F1: 0.8315789473684211
  • Accuracy: 0.750625, F1: 0.7135678391959799
  • Self-consistency: Accuracy: 0.9986688449204333, F1: 0.9977551020408164
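For reference, accuracy and binary F1 as reported above can be computed from gold labels and predictions as follows. This is a minimal sketch, not the project's evaluation code; the label convention (positive class = 1) is an assumption.

```python
def accuracy(y_true, y_pred):
    # Fraction of predictions that match the gold labels.
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_binary(y_true, y_pred, positive=1):
    # Harmonic mean of precision and recall for the positive class
    # (assumed here to be labeled 1).
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]
print(accuracy(y_true, y_pred))   # 0.75
print(f1_binary(y_true, y_pred))  # 0.75
```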


Notes
  • Self-consistency test: evaluation on the entire training data.
  • Train-test splits: if a dataset does not come with predefined train and test portions, a 70/30 train-test split is performed.
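The 70/30 fallback split can be sketched as below. The shuffling strategy and seed are assumptions for illustration, not taken from the model's actual training code.

```python
import random

def train_test_split(samples, train_fraction=0.7, seed=42):
    # Deterministically shuffle a copy of the data, then cut at the
    # train_fraction mark (70% train, 30% test by default).
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

data = list(range(100))
train, test = train_test_split(data)
print(len(train), len(test))  # 70 30
```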

Debiasing

Test terms from StereoSet that are contained in the training data. Each value is the difference in attention entropy before and after optimization of information flow. Unit: entropy bits.

Term          ΔEntropy [bits]
secretario     8.736718663830295e-05
editor         4.540341982777949e-05
Columbia       3.577239137946263e-05
él             2.7654810258225867e-05
dama           2.201377931043854e-05
Polonia        1.1694820258670474e-05
arquitecto     9.631028448316861e-06
piloto         7.3952539871004465e-06
compositor     5.503444827609635e-06
académico     -1.3758612069024087e-06
científico    -3.4396530172560216e-06
islam         -6.191375431060839e-06
doctor        -1.58224038793777e-05
ella          -2.4249553771654954e-05
profesor      -3.611635668118823e-05
abogado       -3.989997500016985e-05
padre         -4.9874968750212316e-05
corredor      -5.021893405193792e-05
hija          -5.297065646574274e-05
iglesia       -5.4346517672645145e-05
India         -7.016892155202284e-05
Europa        -7.016892155202284e-05
México        -7.204509592460563e-05
Rusia         -8.484477442479429e-05
mujer         -8.75964968385991e-05
Bolivia       -9.046287435383337e-05
cantante      -9.103614985713649e-05
Alemania      -9.110166705740417e-05
esposa        -9.49344232762662e-05
su            -9.665424978489421e-05
entrenador    -0.00010800510474183908
hombre        -0.00010965613819037824
Paraguay      -0.0001286430228453752
poeta         -0.0001410257737074969
director      -0.0001430895655178505
criada        -0.00015203266336271615
Francia       -0.0001605826580059758
señor         -0.0001662498958348953
Irlanda       -0.00017954988750076432
hermano       -0.00018642919353527637
Katar         -0.00018711712413872758
pintor        -0.00021876193189748297
Chile         -0.0002344696806767126
escritor      -0.0002356162316820375
Venezuela     -0.00024228055940297104
Brasil        -0.00025155329066156324
Albania       -0.00027861189439773774
Austria       -0.0003026894655185299
Singapur      -0.0003370859956910901
Portugal      -0.0003845532073292232
Argentina     -0.0004296699894047105
hijo          -0.0005303944952608785
fotógrafo     -0.0006810512974166923
Ghana         -0.0007746098594860561
vendedor      -0.0010239847032371178


Notes

  • Higher entropy scores indicate an increase in information flow between the layers of the neural network. This also strengthens the interrelatedness between a term and its surroundings and reduces overfitting.

  • Higher is not automatically better: depending on the base model and the task-specific training data, training-time optimization can have equally valid reasons to reduce entropy scores.
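As a rough illustration of the metric reported above (not the project's actual implementation), the Shannon entropy in bits of a single attention distribution can be computed as follows; the table lists differences of such entropies before and after optimization.

```python
import math

def attention_entropy_bits(weights):
    # Shannon entropy in bits of one attention distribution.
    # Rows of an attention matrix sum to 1 after the softmax, so they
    # can be treated as probability distributions over tokens.
    return -sum(w * math.log2(w) for w in weights if w > 0)

# A head attending uniformly over 4 tokens has maximal entropy: log2(4) = 2 bits.
uniform = [0.25, 0.25, 0.25, 0.25]
print(attention_entropy_bits(uniform))  # 2.0

# A head attending almost exclusively to one token has near-zero entropy.
peaked = [0.97, 0.01, 0.01, 0.01]
print(attention_entropy_bits(peaked))
```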