hexis-mod-ar

Model type: bert

Downstream task: binary text classification

Fine-tuning emissions: 0.07969575 kg CO₂eq
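
A figure like this can be measured with an emissions tracker such as codecarbon; a minimal sketch follows. The tool choice and the project name are assumptions, since the card does not state how the number was obtained.

```python
# Minimal sketch: measuring fine-tuning emissions with codecarbon.
# codecarbon and the project name are assumptions; the card does not say
# which tool produced the 0.07969575 kg figure.
from codecarbon import EmissionsTracker

tracker = EmissionsTracker(project_name="hexis-mod-ar-finetuning")
tracker.start()
# ... fine-tuning loop goes here ...
emissions_kg = tracker.stop()  # emissions in kg CO2eq
print(f"Fine-tuning emissions: {emissions_kg:.8f} kg CO2eq")
```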


Benchmarks

Accuracy: 0.9324768988054992
F1: 0.9593663538219497

Accuracy: 0.9416231608922638
F1: 0.9245306172536506

Accuracy: 0.8973760932944607
F1: 0.9310884886452623

  • Self-consistency
Accuracy: 0.9998576975462992
F1: 0.9998960174690652
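
The accuracy/F1 pairs above can be computed as in the following sketch; the labels here are hypothetical stand-ins, not data from this card. The self-consistency scores come from the same computation with the training set itself used as the evaluation set.

```python
# Sketch: computing the benchmark metrics with scikit-learn.
# y_true/y_pred are hypothetical placeholders for a binary test set.
from sklearn.metrics import accuracy_score, f1_score

y_true = [1, 0, 1, 1, 0, 1]  # gold labels of an evaluation set
y_pred = [1, 0, 1, 0, 0, 1]  # model predictions

print("Accuracy:", accuracy_score(y_true, y_pred))
print("F1:", f1_score(y_true, y_pred))
```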


Notes
  • Self-consistency test: evaluation on the complete training data.
  • Train-test splits: if a dataset does not come with predefined train and test portions, a 70/30 train-test split is applied (see the sketch below).
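
A minimal sketch of the 70/30 fallback split with scikit-learn. The fixed seed and the stratification are assumptions; the card only specifies the ratio.

```python
# Sketch: 70/30 train-test split for a dataset without predefined portions.
# random_state and stratify are assumptions; the card only fixes the ratio.
from sklearn.model_selection import train_test_split

texts = [f"example {i}" for i in range(10)]  # hypothetical documents
labels = [0, 1, 0, 1, 1, 0, 1, 0, 1, 0]     # hypothetical binary labels

train_texts, test_texts, train_labels, test_labels = train_test_split(
    texts, labels, test_size=0.30, random_state=42, stratify=labels
)
```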

Debiasing

Test terms from StereoSet that also occur in the training data. Each entry shows the difference in attention entropy before and after the optimization of information flow. Unit: entropy bits.
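
One way such per-term deltas can be computed is sketched below. The checkpoints, the probe sentence, and the aggregation over layers and heads are all assumptions; only "attention entropy in bits" is taken from this card.

```python
# Sketch: per-term attention entropy (in bits), averaged over layers and
# heads, compared between two checkpoints. Checkpoints, probe sentence, and
# aggregation are assumptions, not the exact pipeline behind this card.
import torch
from transformers import AutoModel, AutoTokenizer

def mean_attention_entropy_bits(model, tokenizer, text, term):
    """Mean Shannon entropy (bits) of the attention rows of the term's
    tokens, averaged over all layers and heads."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs, output_attentions=True)
    att = torch.stack(out.attentions)  # (layers, batch, heads, seq, seq)
    term_ids = set(tokenizer(term, add_special_tokens=False)["input_ids"])
    positions = [i for i, t in enumerate(inputs["input_ids"][0].tolist())
                 if t in term_ids]
    rows = att[:, 0, :, positions, :]  # attention rows of the term's tokens
    entropy = -(rows * torch.log2(rows.clamp_min(1e-12))).sum(dim=-1)
    return entropy.mean().item()

# Hypothetical checkpoints: an Arabic BERT base and its optimized variant.
tok = AutoTokenizer.from_pretrained("asafaya/bert-base-arabic")
before = AutoModel.from_pretrained("asafaya/bert-base-arabic")
after = AutoModel.from_pretrained("path/to/optimized-checkpoint")  # placeholder

sentence = "..."  # a training sentence containing the term
delta = (mean_attention_entropy_bits(after, tok, sentence, "فتاة")
         - mean_attention_entropy_bits(before, tok, sentence, "فتاة"))
```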

عامل (worker): 0.0006005634168129014
الجهاد (jihad): 0.0005172091586945169
فتاة (girl): 0.00028755499224260343
القرآن (the Quran): 0.0002579739762942016
كاتب (writer): 0.0002504067396562384
الصين (China): 0.00017244127126467477
هو (he): 1.719826508628011e-05
مصر (Egypt): 1.2382750862121678e-05
هي (she): 4.127583620707226e-06
فرنسا (France): -1.56504212285149e-05
شاعر (poet): -2.2701709913889743e-05
رئيس (president): -3.370859956910901e-05
رجل (man): -3.474049547428582e-05
إيران (Iran): -5.021893405193792e-05
له (to him/his): -5.1938760560565925e-05
ها (her, pronominal suffix): -6.164730231743285e-05
الأردن (Jordan): -7.464047047445568e-05
أوروبا (Europe): -0.00010043786810387584
سيد (mister/master): -0.00010387752112113185
ألمانيا (Germany): -0.00010525338232803427
المغربية (Moroccan, fem.): -0.00011006889655219269
شريف (noble/sharif): -0.0001107568271556439
ابن (son): -0.00013001888405227762
مستشار (advisor): -0.00013414646767298484
زوج (husband): -0.00013758612069024088
حبيب (beloved): -0.00013903077495723212
مسيحي (Christian): -0.0001410257737074969
أكاديمي (academic): -0.00015134473275926495
روسيا (Russia): -0.00020018780560430047
بنت (girl/daughter): -0.00022300417061833829
البرازيل (Brazil): -0.0002501610501542736
لها (to her/hers): -0.00026660750536674544
الشريعة (Sharia): -0.0002696687965528721
نموذج (model): -0.00029581015948401787
الحكم (rule/governance): -0.00031988773060481003
عالم (scholar/scientist): -0.00032126359181171245
اليونان (Greece): -0.0003439653017256022
مدرس (teacher): -0.000350156677156663
السويد (Sweden): -0.00040656698663966176
كنيسة (church): -0.0004100066396569178
الهند (India): -0.0004275488700449235
السوري (Syrian): -0.0004708884980623494
اليابان (Japan): -0.0006963577533434816
محمد (Muhammad): -0.0008466361936668846
الزوج (the husband): -0.000878487380607188
دبلوماسي (diplomat): -0.000924234765736693


Notes

  • Higher entropy scores indicate increased information flow between the layers of the network, which strengthens the interrelatedness between a term and its context and reduces overfitting.

  • Higher is not automatically better: depending on the base model and the task-specific training data, training-time optimization can have equally valid reasons to reduce entropy scores (see the illustration below).
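
As a tiny numeric illustration of the first note (the probabilities are invented for this example): a token that spreads its attention uniformly over four positions has an entropy of 2 bits, while one that attends almost exclusively to a single position has about 0.24 bits.

```python
# Illustration: attention entropy in bits for a spread vs. a peaked
# distribution. The probabilities are made up for this example.
import numpy as np

def entropy_bits(p):
    p = np.asarray(p, dtype=float)
    return float(-(p * np.log2(p)).sum())

print(entropy_bits([0.25, 0.25, 0.25, 0.25]))  # 2.0 bits  (spread attention)
print(entropy_bits([0.97, 0.01, 0.01, 0.01]))  # ~0.24 bits (peaked attention)
```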