hexis-mod-pt

Model type: bert

Downstream task: binary text classification

Fine-tuning emissions: 0.04407077 kg CO₂-equivalents (CO₂eq)
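
Emissions figures like the one above are typically obtained by multiplying measured energy consumption by the carbon intensity of the local power grid. A minimal sketch of that conversion; the energy and intensity numbers below are illustrative assumptions, not the values behind this card:

```python
# Convert measured training energy into kg of CO2-equivalents.
# Both input values below are illustrative assumptions.

def co2eq_kg(energy_kwh: float, carbon_intensity_kg_per_kwh: float) -> float:
    """kg CO2eq = energy consumed (kWh) x grid carbon intensity (kg CO2eq/kWh)."""
    return energy_kwh * carbon_intensity_kg_per_kwh

# e.g. ~0.1 kWh of GPU energy at an assumed intensity of 0.44 kg CO2eq/kWh
emissions = co2eq_kg(0.1, 0.44)
```

In practice a tracker such as codecarbon measures the energy term automatically during training; the arithmetic stays the same.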


Benchmarks

  • Accuracy: 0.965603502188868, F1: 0.9353701527614571

  • Accuracy: 0.8863693625598388, F1: 0.812785388127854

  • Accuracy: 0.9274829931972789, F1: 0.804904831625183

  • Self-consistency: Accuracy: 0.9986184505923393, F1: 0.9969111969111969
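
For a binary classifier, the accuracy and F1 scores above follow from standard definitions over true/false positives and negatives. A minimal sketch with toy labels (not the model's actual predictions):

```python
# Accuracy and F1 for binary classification, from first principles.
# The label lists used below are toy data, not this model's outputs.

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1(y_true, y_pred):
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return (2 * precision * recall / (precision + recall)
            if (precision + recall) else 0.0)
```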


Notes
  • Self-consistency test: evaluation on the full training data.
  • Train-test splits: if a dataset does not ship with separate train and test portions, a 70/30 train-test split is performed.
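
The 70/30 split mentioned above can be sketched as follows; the fixed seed and the toy sample list are illustrative assumptions, not the card's actual setup:

```python
import random

# Shuffle-then-cut 70/30 train-test split with a fixed seed for
# reproducibility. Seed and data below are illustrative assumptions.

def train_test_split(samples, test_fraction=0.3, seed=42):
    rng = random.Random(seed)
    shuffled = samples[:]          # copy so the input is left untouched
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]

train, test = train_test_split(list(range(100)))  # 70 train, 30 test samples
```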

Debiasing

Test terms from StereoSet that also occur in the training data. The values show the difference in attention entropy before and after optimization of information flow. Unit: entropy bits.

Term          ΔEntropy (bits)
jornalista     8.667925603485174e-05
igreja         8.564736012967494e-05
professor      5.709824008644996e-05
caixa          5.400255237091954e-05
dela           4.6779281034681895e-05
Europa         4.127583620707226e-05
Venezuela      6.879306034512043e-06
Português     -4.8155142241584306e-06
hindu         -1.3070681465572882e-05
secretário    -2.4765501724243356e-05
filha         -2.510946702596896e-05
exército      -2.545343232769456e-05
filho         -2.6141362931145765e-05
seu           -4.555083352742071e-05
sua           -4.804652162045883e-05
editor        -5.365858706919394e-05
cara          -5.4174535021782344e-05
ele           -5.549880143265709e-05
CEO           -6.26016849140596e-05
advogado      -7.842408879343729e-05
médico        -8.323960301759573e-05
França        -8.897235804550152e-05
treinador     -9.03482192536853e-05
Brasil        -9.34725707439324e-05
Brasileiro    -9.49344232762662e-05
agricultor    -9.631028448316861e-05
esposa        -9.734218038834542e-05
mulher        -0.00010181372931077824
pai           -0.00010387752112113185
China         -0.00011450222710876374
Rússia        -0.00011557234137980233
Portugal      -0.00011940509759866437
ela           -0.00011945914928955791
piloto        -0.00013277060646608245
Alemanha      -0.00014981599808507133
compositor    -0.00017335851206970348
Taiwan        -0.00017519299367847957
Irlanda       -0.00018849298534563
México        -0.00020041711580459661
modelo        -0.0002084429728457149
fotógrafo     -0.00021601020948367817
diretor       -0.00021669814008712937
vendedor      -0.00024043174590619593
Japão         -0.00026760500474251847
irmão         -0.0002724205189666769
África        -0.0003291747937514013
guarda        -0.0003783618318981624
Nepal         -0.0003916618235646721
cientista     -0.00041860577220005787
Senhor        -0.00045128247586399004
Peru          -0.0008485623993570605
Chile         -0.001293911473766284
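
The scores above are differences of Shannon entropies, in bits, of attention distributions. A minimal sketch of that computation on toy attention weights (the vectors below are illustrative, not taken from the model):

```python
import math

# Shannon entropy in bits of one attention distribution (a row of an
# attention matrix, non-negative and summing to 1). Toy weights below
# are illustrative, not extracted from this model.

def attention_entropy_bits(weights):
    return -sum(w * math.log2(w) for w in weights if w > 0)

spread = [0.25, 0.25, 0.25, 0.25]   # attention spread evenly: 2.0 bits
peaked = [0.97, 0.01, 0.01, 0.01]   # attention focused on one token

# A term's debiasing score is the entropy after optimization minus the
# entropy before; positive values mean attention became more spread out.
delta = attention_entropy_bits(spread) - attention_entropy_bits(peaked)
```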


Notes

  • Higher entropy scores indicate an increase in information flow between the layers of the neural network. This also strengthens the interrelatedness between a term and its context and reduces overfitting.

  • Higher is not automatically better: depending on the base model and the task-specific training data, optimization at training time can have equally valid reasons to reduce entropy scores.