
      The Risks of Machine Translation

      2017-05-02 22:43:30 By Arthur Goldhammer
      English Language Learning (英语学习), 2017, Issue 4
      Keywords: nun, corpus, translator


      The ideal translator is a person “on whom nothing is lost,” said Henry James. Or maybe it’s a machine. But a machine won’t stop you from swearing at nuns...

      Years ago, on a flight from Amsterdam to Boston, two American nuns seated to my right listened to a voluble[1] young Dutchman who was out to discover the United States. He asked the nuns where they were from. Alas, Framingham, Massachusetts was not on his itinerary, but, he noted, he had “shitloads of time and would be visiting shitloads of other places”.[2]

      The jovial young Dutchman had apparently gathered that “shitloads” was a colourful synonym for the bland “lots”.[3] He had mastered the syntax of English and a rather extensive vocabulary but lacked experience of the appropriateness of words to social contexts.[4]

      This memory sprang to mind with the recent news that the Google Translate engine would move from a phrase-based system to a neural network. Both methods rely on training the machine with a “corpus”[5] consisting of sentence pairs: an original and a translation. The computer then generates rules for inferring, based on the sequence[6] of words in the original text, the most likely sequence of words in the target language.
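
The idea of inferring likely target words from sentence pairs can be sketched in a few lines of Python. This is a deliberately crude co-occurrence counter, not Google's phrase-based or neural method; the four-pair corpus and the names `train`, `best_match` and `translate` are invented for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus of (English, French) sentence pairs, invented for this sketch.
CORPUS = [
    ("the house", "la maison"),
    ("the car", "la voiture"),
    ("a house", "une maison"),
    ("a car", "une voiture"),
]

def train(pairs):
    """Count how often each source word appears alongside each target word."""
    cooc = defaultdict(Counter)
    target_totals = Counter()
    for source, target in pairs:
        target_words = target.split()
        target_totals.update(target_words)
        for source_word in source.split():
            cooc[source_word].update(target_words)
    return cooc, target_totals

def best_match(word, cooc, target_totals):
    """Return the target word that co-occurs with `word` most often,
    relative to how common that target word is overall."""
    candidates = cooc.get(word)
    if not candidates:
        return None  # word never seen in training
    # Ties are broken alphabetically so the result is deterministic.
    return max(candidates, key=lambda t: (candidates[t] / target_totals[t], t))

def translate(sentence, cooc, target_totals):
    """Translate word by word, leaving unknown words unchanged."""
    return " ".join(best_match(w, cooc, target_totals) or w for w in sentence.split())

cooc, totals = train(CORPUS)
print(translate("the car", cooc, totals))  # prints: la voiture
```

Nothing in the counts records who was speaking or to whom; the sketch only sees which words tend to appear together.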

      The procedure is an exercise in pattern matching. Similar pattern-matching algorithms are used to interpret the syllables you utter when you ask your smartphone to “navigate to Brookline” or when a photo app tags your friend’s face.[7] The machine doesn’t “understand” faces or destinations; it reduces them to vectors[8] of numbers, and processes them.
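
As a rough illustration of what reducing something to a vector of numbers can mean, the sketch below turns short commands into word-count vectors and compares them with cosine similarity. The vocabulary and the function names are assumptions made for this example; real speech and image systems use far richer representations.

```python
import math

# Hypothetical fixed vocabulary; each position in a vector counts one of these words.
VOCAB = ["navigate", "to", "brookline", "boston", "call"]

def to_vector(text, vocabulary):
    """Represent a text as a list of word counts, one entry per vocabulary word."""
    words = text.lower().split()
    return [words.count(term) for term in vocabulary]

def cosine(u, v):
    """Cosine similarity: 1.0 for identical directions, 0.0 for no overlap."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

a = to_vector("navigate to Brookline", VOCAB)  # [1, 1, 1, 0, 0]
b = to_vector("navigate to Boston", VOCAB)     # [1, 1, 0, 1, 0]
c = to_vector("call Boston", VOCAB)            # [0, 0, 0, 1, 1]

print(cosine(a, b))  # about 0.67: two shared words out of three
print(cosine(a, c))  # 0.0: no words in common
```

The program compares numbers only; it has no notion of what Brookline is or why anyone would go there.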

      I am a professional translator, having translated some 125 books from the French. One might therefore expect me to bristle[9] at Google’s claim that its new translation engine is almost as good as a human translator, scoring 5.0 on a scale of 0 to 6, whereas humans average 5.1. But I’m also a PhD in mathematics who has developed software that “reads” European newspapers in four languages and categorises the results by topic. So, rather than be defensive about the possibility of being replaced by a machine translator, I am aware of the remarkable feats of which machines are capable, and full of admiration for the technical complexity and virtuosity of Google’s work.[10]

      My admiration does not blind me to the shortcomings of machine translation, however. Think of the young Dutch traveler who knew “shitloads” of English. The young man’s fluency demonstrated that his “wetware” (a living neural network, if you will) had been trained well enough to intuit the subtle rules (and exceptions) that make language natural.[11] Computer languages, on the other hand, have context-free grammars. The young Dutchman, however, lacked the social experience with English to grasp the subtler rules that shape the native speaker’s diction, tone and structure. The native speaker might also choose to break those rules to achieve certain effects. If I were to say “shitloads of places” rather than “lots of places” to a pair of nuns, I would mean something by it. The Dutchman blundered into inadvertent comedy.[12]

      Google’s translation engine is “trained” on corpora ranging from news sources to Wikipedia. The bare description of each corpus is the only indication of the context from which it arises. From such scanty[13] information it would be difficult to infer the appropriateness or inappropriateness of a word such as “shitloads”. If translating into French, the machine might predict a good match to beaucoup or plusieurs. This would render the meaning of the utterance but not the comedy,[14] which depends on the socially marked “shitloads” in contrast to the neutral plusieurs. No matter how sophisticated the algorithm, it must rely on the information provided, and clues as to context, in particular social context, are devilishly[15] hard to convey in code.
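
The loss described here can be made concrete with a toy lexicon. The register tags below are hand-written assumptions for this sketch; a real training corpus carries no such labels, which is exactly the difficulty.

```python
# Hypothetical lexicon: English word -> (French rendering, register tag).
# The register tags are invented for illustration only.
LEXICON = {
    "lots": ("beaucoup", "neutral"),
    "shitloads": ("beaucoup", "vulgar"),
}

def translate_meaning_only(word):
    """Return the French rendering and discard the register tag."""
    return LEXICON[word][0]

def register_lost(word_a, word_b):
    """True when two words differ in register but translate identically."""
    (target_a, register_a), (target_b, register_b) = LEXICON[word_a], LEXICON[word_b]
    return target_a == target_b and register_a != register_b

print(translate_meaning_only("lots"))       # prints: beaucoup
print(translate_meaning_only("shitloads"))  # prints: beaucoup
print(register_lost("lots", "shitloads"))   # prints: True
```

Both words come out as beaucoup, so a French reader has no way of telling which one the Dutchman used.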

      The problem, as with all previous attempts to create artificial intelligence (AI)[16] going back to my student days at MIT, is that intelligence is incredibly complex. To be intelligent is not merely to be capable of inferring logically from rules or statistically from regularities. Before that, one has to know which rules are applicable, an art requiring awareness of and sensitivity to situation. Programmers are very clever, but they are not yet clever enough to anticipate the vast variety of contexts from which meaning emerges. Hence even the best algorithms will miss things, and as Henry James put it, the ideal translator must be a person “on whom nothing is lost”.

      This is not to say that mechanical translation is not useful. Much translation work is routine. At times, machines can do an adequate job. Don’t expect miracles, however, or felicitous literary translations, or aptly rendered political zingers.[17] Overconfident claims have dogged[18] AI research from its earliest days. I don’t say this out of fear for my job: I’ve retired from translating and am devoting part of my time nowadays to… writing code.


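      1. voluble: talkative.

      2. itinerary: travel plan, planned route; shitload: a great many, a large amount.

      3. jovial: warm and friendly, cheerful by nature; synonym: a word with the same or nearly the same meaning; bland: mild, gentle.

      4. syntax: grammar, sentence structure; appropriateness: suitability, propriety.

      5. corpus: a collection of language data used for study or training.

      6. sequence: order, succession.

      7. algorithm: a computational procedure; syllable: a unit of pronunciation; navigate: to find or direct a route.

      8. vector: an ordered list of numbers (a mathematical term).

      9. bristle: to show anger.

      10. feat: achievement, accomplishment; virtuosity: consummate skill.

      11. wetware: a computing term for the “ware” other than software and hardware, that is, the human brain and nervous system; intuit: to know by intuition.

      12. blunder: to stumble, to make a mistake; inadvertent: unintentional, not deliberate.

      13. scanty: insufficient, barely enough.

      14. render: to express (in a different language), to translate; utterance: expression, statement.

      15. devilishly: very, extremely.

      16. artificial intelligence (AI): intelligence exhibited by machines.

      17. felicitous: apt, well chosen; aptly: suitably; zinger: a witty remark, a quip.

      18. dog: used as a verb, meaning to follow closely.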
