learninglinguist:damnprecious:learninglinguist:

Link to the original tweet

I got tagged in this post so I’m just gonna go off, I guess.

These tweets are from 2017. Now in 2020, I went to test out how Google Translate would translate similar sentences in Finnish, as Finnish also only has a gender neutral “hän” for he/she. Here’s what I got for results:

Both feminine and masculine translation options. Even for firefighter, which still directly translates as “fireman”. Similarly, I got both masculine and feminine results when I translated a sentence from Turkish to English, but as I don’t know Turkish I didn’t try more than once, as I cannot tell how correct it is otherwise. According to my quick googling, this feature was added in late 2018.

More interestingly, when I translated from Finnish to French, I was no longer offered two options as translations. Google Translate also offered a general info page on this phenomenon, explaining how the software offers both masculine and feminine translations for some gender neutral words and sentences in some languages, and that more are in development, which I’m assuming is why the translation between Finnish and French only supplied one version at this time. It takes time to work through all possible language pairs.

If I translated several sentences in one go it lost the second translation like so:

This type of translation is more complex for machine translation software to handle, as there is much more info to translate than a single, simple sentence. The actual translation would also become more difficult to use if all the sentences were offered twice as masculine and feminine, as the text would become repetitive and more difficult to read. Translations made by humans also don’t offer all possible gender options but rely on the larger context when choosing a translation solution. In these simple out-of-context sentences the machine picks the option that is statistically the most common.
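To make “statistically the most common” a bit more concrete, here is a toy sketch in Python. It is not how Google Translate works internally (that is a neural system trained on a huge corpus). The mini-corpus, the function names and the simple counting approach are all made up for illustration.

```python
from collections import Counter

# Hypothetical mini-corpus, invented for illustration only.
# A real system learns from millions of documents, not seven sentences.
TOY_CORPUS = [
    "he is a firefighter",
    "he is a firefighter",
    "she is a firefighter",
    "she is a nurse",
    "she is a nurse",
    "he is a nurse",
    "she is a nurse",
]

PRONOUNS = ("he", "she")


def pronoun_counts(corpus, noun):
    """Count how often each pronoun starts a sentence that ends in `noun`."""
    counts = Counter()
    for sentence in corpus:
        words = sentence.split()
        if words and words[-1] == noun and words[0] in PRONOUNS:
            counts[words[0]] += 1
    return counts


def most_common_pronoun(corpus, noun):
    """Return the pronoun seen most often with `noun`, or None if unseen."""
    counts = pronoun_counts(corpus, noun)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


if __name__ == "__main__":
    for noun in ("firefighter", "nurse", "translator"):
        print(noun, "->", most_common_pronoun(TOY_CORPUS, noun))
```

With this made-up corpus the guess for “firefighter” is “he” and for “nurse” it is “she”, purely because of which sentences happen to be in the list. Change the sentences and the guess changes, without anyone touching the code.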
The machine would likely use wrong pronouns even in cases where there is enough context for a human translator to figure out the gendered pronouns, as translation software at this time is far from perfect.

Software has limitations far beyond simply gendered vs non-gendered pronouns, because all languages have complex features that don’t have a perfect match in the target language. Could Google Translate be coded to offer consistently only masculine or only feminine translation options? Possibly. That would not be ideal either. Could it be coded to only use gender neutral pronouns for languages where gendered pronouns are used? Who knows. It definitely would not reflect how the language is actually used. Some languages simply use gendered language, and the software functions in that context and with that corpus material. Would that change in the future if the language in question moves towards gender neutral pronouns? Definitely. Because statistics.

A human translator would have the same issue of not knowing which gendered pronoun to use in a translation if there is no context and the source language is gender neutral. However, a human translator is much more free to choose a gender neutral singular they, or include a he/she (in the case of English), or consistently use one gendered pronoun for a translation, although the commissioner may affect this. Software that relies on corpus matches and statistics can’t make a conscious choice.

These matches based on statistics don’t simply come down to the tech industry being predominantly male. The people who code the software do not write the texts of the corpus. The corpus is collected from existing texts. Google Translate uses a corpus that consists of millions of documents, which include documents from the United Nations and the European Parliament.
Google Translate is currently based on a neural machine translation principle, which means it translates by predicting the most likely sequence of words based on the massive corpus.

English has gendered pronouns. There are fields where the majority of the work force is male or female, which is more likely to show in the texts the software uses. It makes a guess on what pronouns to use based on that context. The software itself isn’t inherently sexist in this case. It just works with the existing material.

Quite frankly I’m amazed that Google Translate even offers the simple sentences with both masculine and feminine translations. I’m not surprised it offers only one option if more things are translated at once. Google Translate is a useful tool if you need a quick translation that doesn’t have to be that good. It shouldn’t be relied on as anything more than that, especially in languages that have less material it can use as a corpus.

Most importantly, we need human translators. Machine translators cannot do what human translators can.

Source: a human translator.

Tl;dr - Google Translate has been improved, it offers both masculine and feminine options for simple translations, and relies on a massive corpus that’s collected from existing sources that are not written by the people who create the software.

Since I’m newly Back On Tumblr, I’ll once again reblog this post that has haunted me for three years. I really love this addition because it distinguishes between unintentional biases that come from the people who create the technology vs. the huge collection of information that teaches the tech. Thanks for breaking down the logistics of Google Translate’s abilities!!
#machine translation #bias