[PDF] Domain Adaptation for Arabic Cross-Domain and Cross-Dialect

[PDF] Arabic Cross-dialectal Conversations with Implications for the

The analysis also showed that dialect familiarity has a major role in aiding comprehension between the native speakers of different Arabic dialects. The second

[PDF] Domain Adaptation for Arabic Cross-Domain and Cross-Dialect

6 jui 2021 · bic (MSA) and Arabic dialect. MSA has a standard written form and acquires an official status across the Arab countries, while Dialectal

[PDF] Advancing Arabic Language Teaching and Learning

MSA is expected to be the language of classroom instruction throughout the Arabic speaking world (including most MENA countries). As soon as children enter

[PDF] The Evolution of the Arabic Language Through Online Writing

spoken forms of Arabic in modern day Egypt, concluding that Egyptians use MSA lexicon and that this is true across different Arabic speaking countries,

Comparative Evaluation of Sentiment Analysis Methods Across

dialects that varies significantly across the Arab world, Multi-Dialect Arabic Sentiment Twitter Dataset (MD-ArSenTD) that is composed of tweets


2450_42021_naacl_main_226.pdf

                      SOTA             Our Results
Source     Target     DANN_BOW  ADRL   ZS-BERT  CORAL  MMD    DANN   Ours
Jordan     Lebanon    29        30     47       50     50.9   49.3   52
Jordan     Palestine  34.5      35     47.5     50.3   51.1   51.2   52.8
Jordan     Syria      32        33     51.7     53.3   53.2   51.9   54.2
Lebanon    Jordan     29        32     45       46.8   47.1   47.4   48.8
Lebanon    Palestine  31        35     42.7     50.5   50.7   51     52.4
Lebanon    Syria      37        37.5   49.6     50.7   51.1   50     52
Palestine  Jordan     32        32.5   45       50.6   49.7   47.4   52.4
Palestine  Lebanon    31        31     42       50     50.5   50.5   51.9
Palestine  Syria      28.5      27.5   51.7     52.4   52.4   51.3   53.7
Syria      Jordan     30.5      32     44.7     48.5   49.1   49.4   51
Syria      Lebanon    35        35.5   46.1     51.5   51.1   50.6   52
Syria      Palestine  31.5      37.5   47.1     49.7   49.8   51.3   52.9

Table 1: The results of accuracy measurement of Arabic cross-dialect sentiment analysis using the ArSentD-LEV dataset. The SOTA results are taken from (Khaddaj et al., 2019).

zero-shot transfer-based method, outperforms both DANN_BOW and ADRL, the state-of-the-art domain adaptation methods that are based on the bag-of-words representation. Moreover, training the state-of-the-art domain adaptation methods, including CORAL, MMD, and DANN, on top of the BERT module improves BERT's transfer performance across dialects. These three methods achieve comparable performance for most source and target dialects and outperform both DANN_BOW and ADRL. Furthermore, our method, which is based on BERT and ALDA's losses, surpasses the existing state-of-the-art methods and ZS-BERT with average improvements of 19% and 5.5%, respectively. It also performs better than the other domain adaptation methods that are implemented on top of BERT (CORAL, MMD, and DANN).
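As a rough check on the averages quoted above, the following sketch recomputes them from the Table 1 accuracies. The dictionary layout and the function names (`mean`, `average_gap`) are illustrative assumptions, not part of the paper, and the gaps are treated here as differences in accuracy points rather than relative changes.

```python
# Accuracies transcribed from Table 1, in row order:
# Jordan->(Lebanon, Palestine, Syria), Lebanon->(Jordan, Palestine, Syria),
# Palestine->(Jordan, Lebanon, Syria), Syria->(Jordan, Lebanon, Palestine).
# The structure and names below are illustrative, not taken from the paper.
TABLE1 = {
    "DANN_BOW": [29, 34.5, 32, 29, 31, 37, 32, 31, 28.5, 30.5, 35, 31.5],
    "ADRL": [30, 35, 33, 32, 35, 37.5, 32.5, 31, 27.5, 32, 35.5, 37.5],
    "ZS-BERT": [47, 47.5, 51.7, 45, 42.7, 49.6, 45, 42, 51.7, 44.7, 46.1, 47.1],
    "CORAL": [50, 50.3, 53.3, 46.8, 50.5, 50.7, 50.6, 50, 52.4, 48.5, 51.5, 49.7],
    "MMD": [50.9, 51.1, 53.2, 47.1, 50.7, 51.1, 49.7, 50.5, 52.4, 49.1, 51.1, 49.8],
    "DANN": [49.3, 51.2, 51.9, 47.4, 51, 50, 47.4, 50.5, 51.3, 49.4, 50.6, 51.3],
    "Ours": [52, 52.8, 54.2, 48.8, 52.4, 52, 52.4, 51.9, 53.7, 51, 52, 52.9],
}


def mean(values):
    """Arithmetic mean of a non-empty list of accuracies."""
    return sum(values) / len(values)


def average_gap(table, method, baseline):
    """Difference between the mean accuracies of two methods, in points."""
    return mean(table[method]) - mean(table[baseline])


if __name__ == "__main__":
    for baseline in ("ADRL", "ZS-BERT"):
        gap = average_gap(TABLE1, "Ours", baseline)
        print(f"Ours vs {baseline}: {gap:.1f} points")
```

With this transcription, the gap to ZS-BERT comes out at 5.5 points and the gap to ADRL at roughly 19 points, which suggests the reported improvements are absolute differences in average accuracy.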

In accordance with the results obtained for the cross-dialect setting, Table 2 shows that the ZS-BERT method outperforms both DANN_BOW and ADRL in most test cases of cross-domain sentiment analysis (14 out of 20 cases). The results also show that the three domain adaptation methods CORAL, MMD, and DANN outperform both DANN_BOW and ADRL and improve the transfer performance of the BERT model. On average, these three methods (CORAL, MMD, and DANN) are on a par with each other in terms of accuracy.
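To make concrete what one of these adaptation objectives measures, here is a minimal sketch of a CORAL-style loss, which penalizes the distance between the feature covariances of the source and target domains. The excerpt does not give the exact formulation used, so this follows the commonly cited Deep CORAL definition (squared Frobenius norm of the covariance difference, scaled by 1/(4d^2)). The function names and the toy feature vectors are assumptions for illustration; in the setting described here the features would presumably be BERT sentence representations.

```python
def covariance(features):
    """Unbiased d x d covariance matrix of a list of d-dimensional rows."""
    n = len(features)
    d = len(features[0])
    means = [sum(row[j] for row in features) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in features]
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def coral_loss(source, target):
    """Squared Frobenius distance between covariances, scaled by 1/(4 d^2).

    This is the standard Deep CORAL form; it is an assumption that the
    methods evaluated in the text use exactly this scaling.
    """
    d = len(source[0])
    cs = covariance(source)
    ct = covariance(target)
    squared = sum(
        (cs[i][j] - ct[i][j]) ** 2 for i in range(d) for j in range(d)
    )
    return squared / (4 * d * d)


if __name__ == "__main__":
    # Toy 2-dimensional "features" standing in for sentence embeddings.
    src = [[0.0, 1.0], [1.0, 0.0], [2.0, 3.0], [3.0, 2.0]]
    tgt = [[2 * a, 2 * b] for a, b in src]
    print(coral_loss(src, src))  # identical statistics give zero loss
    print(coral_loss(src, tgt))  # rescaled features give a positive loss
```

Because the loss depends only on second-order statistics, shifting every target feature by a constant leaves it unchanged, while rescaling the features does not.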

Similarly, our proposed method outperforms both state-of-the-art methods (DANN_BOW and ADRL) as well as ZS-BERT by an average increment of 19% and 10.7%, respectively. Moreover, it achieves better performance than CORAL, MMD, and DANN for most source and target domains (12 out of 20 cases).

                      SOTA             Our Results
Source     Target     DANN_BOW  ADRL   ZS-BERT  CORAL  MMD    DANN   Ours
Politics   Personal   28.7      33.3   28.7     41.6   41.3   43     44.3
Politics   Religious  20.3      25.3   10       33.6   33.3   34.2   46.3
Politics   Sport      35.1      35.1   36.7     46.6   32.8   46.8   46.8
Politics   Other      22.5      24.2   38.2     49.7   50     39.7   46.1
Personal   Politics   41.7      36.8   46.3     49.7   49.4   47.5   49.7
Personal   Religious  22.8      23.4   41       44.3   44.7   43.5   44.2
Personal   Sport      26.8      25.8   43.5     49.7   49.5   48.2   46.6
Personal   Other      33.8      35.4   53       57.4   57.7   49.6   58
Religious  Politics   15.5      15.5   12       42     42     37.6   40.8
Religious  Personal   24.1      26.1   25       35.1   37     36.8   38
Religious  Sport      25.8      26.8   21.6     38.1   32.8   28.5   34.8
Religious  Other      30.6      27.4   26.4     46.4   43.2   43.2   48.4
Sport      Politics   36.4      30.7   46.9     48.7   48.3   43.1   44.6
Sport      Personal   25.3      24.5   40.7     43.8   42.3   43.6   44.5
Sport      Religious  20        19     30.8     29.2   31     40.2   44
Sport      Other      35.5      35.5   48.3     49     49.6   49     54.2
Other      Politics   23.2      23.2   46.8     46.5   46.4   34.4   46.8
Other      Personal   30.3      24.9   40.2     46.2   44.3   40.3   45.5
Other      Religious  41.8      43     39.5     45.8   47.6   48.6   48.9
Other      Sport      23.7      27.8   46.7     48.4   51.1   47.7   50.9

Table 2: The results of accuracy measurement of Arabic cross-domain sentiment analysis using the ArSentD-LEV dataset. The SOTA results are taken from (Khaddaj et al., 2019).
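The "12 out of 20 cases" figure can be reproduced from the Table 2 accuracies with a short tally. The sketch below is illustrative only: the tuple layout and the function name `count_wins` are assumptions, and a case is counted when the proposed method matches or exceeds the best of CORAL, MMD, and DANN.

```python
# (CORAL, MMD, DANN, Ours) accuracies transcribed from Table 2, in row order:
# Politics->(Personal, Religious, Sport, Other),
# Personal->(Politics, Religious, Sport, Other),
# Religious->(Politics, Personal, Sport, Other),
# Sport->(Politics, Personal, Religious, Other),
# Other->(Politics, Personal, Religious, Sport).
# The layout and names are illustrative, not taken from the paper.
TABLE2 = [
    (41.6, 41.3, 43, 44.3), (33.6, 33.3, 34.2, 46.3),
    (46.6, 32.8, 46.8, 46.8), (49.7, 50, 39.7, 46.1),
    (49.7, 49.4, 47.5, 49.7), (44.3, 44.7, 43.5, 44.2),
    (49.7, 49.5, 48.2, 46.6), (57.4, 57.7, 49.6, 58),
    (42, 42, 37.6, 40.8), (35.1, 37, 36.8, 38),
    (38.1, 32.8, 28.5, 34.8), (46.4, 43.2, 43.2, 48.4),
    (48.7, 48.3, 43.1, 44.6), (43.8, 42.3, 43.6, 44.5),
    (29.2, 31, 40.2, 44), (49, 49.6, 49, 54.2),
    (46.5, 46.4, 34.4, 46.8), (46.2, 44.3, 40.3, 45.5),
    (45.8, 47.6, 48.6, 48.9), (48.4, 51.1, 47.7, 50.9),
]


def count_wins(rows, allow_ties=True):
    """Count rows where 'Ours' beats (or ties) the best of CORAL/MMD/DANN."""
    wins = 0
    for coral, mmd, dann, ours in rows:
        best_other = max(coral, mmd, dann)
        if ours > best_other or (allow_ties and ours == best_other):
            wins += 1
    return wins


if __name__ == "__main__":
    print("wins or ties:", count_wins(TABLE2))
    print("strict wins: ", count_wins(TABLE2, allow_ties=False))
```

With this transcription the tally reaches 12 only when the two ties (Politics to Sport and Personal to Politics) are counted; strict wins alone give 10.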

Scenario 2: Domain adaptation across regional dialects. Table 3 summarizes the results obtained for cross-domain, cross-dialect, as well as cross-domain and cross-dialect Arabic sentiment analysis using two regional dialects (Gulf and Levantine) and MSA data, covering three domains (book reviews, hotel reviews, and Twitter).

The overall results show that the zero-shot transfer from AraBERT (ZS-BERT) outperforms previous state-of-the-art methods (PBLM and HTAN). Moreover, the evaluated domain adaptation methods on top of BERT improve AraBERT's performance for all evaluated scenarios. The results also demonstrate that the performance of the ZS-BERT method drops significantly in the cross-domain scenario as well as in the combined cross-domain and cross-dialect scenario.

Nevertheless, the domain adaptation methods show larger improvements (an increment of 7.4% on average) in the scenarios mentioned above. The obtained results clearly show that our method surpasses the other methods for most target datasets and scenarios; in the few remaining cases, the gap remains small.

Scenario 3: Domain adaptation from MSA to Arabic dialects using social media data.

Table 4 presents the domain adaptation results obtained
