logo elektroda

Vision-based AI models for translating catalogue notes - we test Nano Banana, ChatGPT-Image and othe

p.kaczmarek2 1017 19
The content has been translated from Polish to English.
  • Comparison of two screenshots translating L1050 datasheet using different AI models
    Is a datasheet (catalogue note) in a foreign language still a problem in 2026? Today I will test whether artificial intelligence can replace a translator and translate the text of electronic component datasheets into English. Importantly, the whole experiment is based on screenshots (bitmaps) rather than PDF files, so the AI has no way to make its job easier. Will this form of datasheet translation be of any use? Let's find out!

    The testing methodology is very simple: I take a screenshot of the datasheet and send it to the AI as an attachment, with a prompt asking for an English translation. I use the free LMArena website to run the AI models:
    https://lmarena.ai/
    After testing, I will try to subjectively evaluate and group the models according to their results.

    Note - I have placed the images as generated by the AI. If something is cropped, it means that such a bitmap was created by the model.

    Test 1 - constant-current LED controller
    Inputs:
    Chinese L1050 datasheet page with description, features, and pin layout
    translate to english



    seedream-4-high-res-fal
    L1050 IC on blue datasheet background with technical description and pin annotation
    Such a grand hallucination rather rules out this model.

    reve-v1.1-fast
    Screenshot of a Chinese LED driver datasheet for model L1050 with feature list and pinout diagram
    No translations.

    reve-v1.1
    L1050 datasheet page with tables, description, and IC pinout diagram
    Major unnecessary reworking of document, residual translations.

    chatgpt-image-latest (20251216)
    Screenshot of LED driver L1050 datasheet with features and pin diagram
    Slightly better, occasional typos. Almost usable.

    gpt-image-1.5
    L1050 LED driver datasheet with text errors and pinout diagram
    Slightly better, occasional typos. Almost usable.

    flux-1-context-pro
    Screenshot of a Chinese datasheet for L1050 IC in SOP-16 package
    No translations.

    flux-2-flex
    Screenshot of L1050 datasheet with description, features list, and 16-pin package diagram.
    Virtually no translations, except for the title above the document.

    flux-2-flex-20251231
    Screenshot of a datasheet for the L1050 LED driver chip
    Translation failure - random letters and stamps, mostly no translations.

    qwen-image-edit
    Mosaic of small pastel-colored squares in turquoise, pink, yellow, and blue tones
    Screenshot of the L1050 datasheet with Chinese text and a pin configuration diagram.
    This model hallucinates a strange background and manages to ruin the document. Useless.


    flux-2-max
    Screenshot of L1050 LED driver datasheet with English and Chinese text sections
    Residual translations, most of them meaningless strings of letters. The headline was translated.

    flux-2-pro
    Screenshot of L1050 datasheet showing description, features, ordering, and marking info
    Nonsensical strings of letters, useless result.

    flux-2-pro-20251231
    Screenshot of L1050 datasheet with garbled English and Chinese text and SOP-16 pin layout
    Screenshot of L1050 datasheet with mistranslated English text on a blue background.
    Nonsense strings of letters, useless result.

    gemini-2.5-flash-image-preview (nano-banana)
    Screenshot of L1050 LED driver datasheet with description, features, and SOP-16 package diagram
    Surprisingly the elder Banana did not want to translate anything.

    gemini-3-pro-image-preview (nano-banana-pro)
    L1050 LED IC datasheet with description, features, and SOP-16 pinout diagram
    Best result so far. Almost all of the text is correct, with occasional errors and typos; only in the paragraphs do some words turn into nonsense.

    Test 2 - display controller
    FD650 controller diagram with LED and keyboard interface specifications and pin layout
    translate to english
    gpt-image-1-mini
    Screenshot of FD650 datasheet with Chinese text and a pinout diagram labeled FD-950
    The model reworked the image and renamed the chip, but translated nothing.

    gemini-2.5-flash-image-preview (nano-banana)
    Diagram and description of the FD650 chip functions in Chinese.
    Elder Banana could not cope with the translation.

    flux-2-pro
    Datasheet page for FD650 IC showing features list and 16-pin diagram.
    The model has attempted a translation but the result is unreadable, virtually only the title is helpful - LED Driver/Keyboard Scan.

    flux-2-pro-20251231
    Screenshot of FD650 datasheet with garbled English text and pinout diagram
    Flux 2 deciphered the main keywords, but the rest is useless.


    flux-1-context-pro
    Screenshot of FD650 datasheet with Chinese text and pinout diagram labeled “Translate to english”
    This model simply overlaid the prompt text "Translate to english" on the image.



    flux-2-flex-20251231
    FD650 datasheet excerpt with block description and pinout diagram
    Residual translations.

    gpt-image-1.5
    FD650 datasheet page with IC overview, features list, and pinout diagram
    At first glance very good, but the introduction from the second/third sentence onwards fell apart.

    reve-v1.1
    FD650 datasheet excerpt with features list and pin diagram
    The basics are translated, but the pinout diagram got damaged as well.

    seedream-4-high-res-fal
    Screenshot of FD650 datasheet showing functions and a pinout diagram
    The title may be translated, but the model has added some strange background.

    chatgpt-image-latest (20251216)
    Screenshot of FD650 datasheet with device features and pinout diagram
    Like the other OpenAI model, it's not bad, only the introduction falls apart further on. In addition, I see a slightly damaged pinout diagram.

    gemini-3-pro-image-preview (nano-banana-pro)
    FD650 datasheet page showing description, features, and pinout diagram
    Nano Banana Pro has again performed very well.

    Test 3 - synchronous rectifier
    This time a test with a screenshot:
    Circuit diagram and Chinese-language description of MT6706BL synchronous rectifier
    translate to english

    qwen-image-edit
    Synchronous rectifier circuit diagram with MT6706L and MT6706BL ICs
    Useless result.

    chatgpt-image-latest (20251216)
    Schematic with MT6706BL chip and description of flyback synchronous controller operation
    The basic translation is there, but with lots of typos. Synchornous?

    gpt-image-1.5
    MT6706BL circuit diagram with incorrect OCR-translated English technical text
    Same as the previous GPT.

    seedream-4-high-res-fal
    Rectifier diagram with MT6706BL IC and distorted section of translated specification
    This model has redone the background again....

    gpt-image-1
    Screenshot of MT6760BL datasheet showing descriptive text and circuit diagrams.
    Fragment of datasheet with technical text and MT3706BL block diagrams
    Residual translation. In addition, with another attempt I received a strangely cropped image.

    gpt-image-1-mini
    Diagram of two MT6706BL circuits with bridge rectifiers and capacitors
    And what happened here? A short circuit? And between two separate diagrams at that... On top of that, the model also cropped the picture.

    flux-2-flex
    Flyback power supply schematic with MT6706BL controller and technical description text.
    Again, residual translation.

    gemini-2.5-flash-image-preview (nano-banana)
    Rectifier circuit diagrams using MT6706BL with Chinese text explanation.
    No translation.

    seedream-4.5
    Application diagram for MT6706BL synchronous rectifier with technical description in English.
    Application circuit diagram of MT6706BL used in synchronous rectification
    It came out slightly better this time, but there are still shortcomings.

    flux-1-context-pro
    Block diagram with MT6706BL IC and Chinese-language functional description
    No translation.

    flux-2-pro
    Rectifier schematics with MT6706BL chip and distorted English title and paragraph
    The letters have been changed, but they don't make sense?

    gemini-3-pro-image-preview (nano-banana-pro)
    Another success story for Nano Banana Pro.


    Final ranking of vision models
    I ran additional tests but did not post more images in the topic, because a thread filled with many near-identical nonsense graphics would be unreadable. In the end, I grouped the models according to my overall impression, although I noticed that a particular model occasionally does better or worse; generation probably involves some randomness (the so-called seed).
    Correct translations, occasional errors:
    - gemini-3-pro-image-preview (nano-banana-pro)
    Almost acceptable translations, but problems with some words and blurred letters:
    - chatgpt-image-latest (20251216)
    - gpt-image-1.5
    Sometimes translates something, sometimes hallucinates and produces nonsense:
    - reve-v1.1
    - gpt-image-1
    Barely any attempt at translation, meaningless strings of letters:
    - flux-2-max
    - flux-2-pro
    - flux-2-pro-20251231
    Tries to translate something, but hallucinates and reworks the image:
    - seedream-4-high-res-fal
    Can ruin the image:
    - qwen-image-edit
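
    The grouping above is subjective. As a rough cross-check, one could transcribe the English text from each generated image and compare it with a trusted reference translation. The sketch below is not what was done in this test: it assumes the transcriptions already exist as strings, uses Python's standard `difflib` for character-level similarity, takes the median over several runs to damp the randomness mentioned above, and the tier thresholds (0.9 and 0.6) are made-up illustration values. The names `similarity`, `score_model` and `tier` are hypothetical.

```python
from difflib import SequenceMatcher
from statistics import median
from typing import List


def similarity(candidate: str, reference: str) -> float:
    """Character-level similarity in [0, 1]; 1.0 means identical text."""
    return SequenceMatcher(None, candidate.lower(), reference.lower()).ratio()


def score_model(runs: List[str], reference: str) -> float:
    """Median similarity over several generations of the same page."""
    return median(similarity(run, reference) for run in runs)


def tier(score: float) -> str:
    # Thresholds are arbitrary illustration values, not measured ones.
    if score >= 0.9:
        return "usable"
    if score >= 0.6:
        return "partial"
    return "useless"


if __name__ == "__main__":
    reference = "Constant current LED driver"
    outputs = {
        "model-with-typos": ["Constant curent LED driver"],
        "model-with-noise": ["xqzt vvbn mmkp wwq"],
    }
    for name, runs in outputs.items():
        print(name, tier(score_model(runs, reference)))
```

    Such a score only measures how close the letters are to the reference; it says nothing about a damaged diagram or a hallucinated background, so it could complement a visual check but not replace it.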

    In summary, only the latest Nano Banana Pro seems to give acceptable results when translating datasheet images, although it still produces occasional artefacts. GPT-Image 1.5 and ChatGPT-Image (20251216) come right behind it, but they are no match for it. The rest of the models are useless: some of them try to remake the image and some ignore the text completely.
    It seems AI does not have far to go in this area. I expect that still in 2026 there will be much better models that handle such translations even better, and even if not, Nano Banana Pro is already satisfactory.
    Do you see a use for artificial intelligence in the role of an image translator? Or do you know of other practical applications for Nano Banana Pro and similar models?

    About Author
    p.kaczmarek2
    Moderator Smart Home
    p.kaczmarek2 wrote 14052 posts with rating 11874, helped 637 times. Member since 2014.
  • #2 21796137
    fachman1964
    Level 5  
    Still, very good for a machine. It will take some time before it reaches perfection. Nevertheless, you can read the information you need from such translations, and that is definitely better than Chinese "bushes" (characters). It has always puzzled me why some Chinese manufacturers do not publish datasheets in two languages right away. Maybe such chips are intended for the Chinese market rather than for export? I doubt it is because they don't know English.
  • #3 21796167
    szeryf3
    Level 30  
    Artificial intelligence is learning by the day, and I suspect that by the end of this year there will be a visible difference in this area.
    Not so long ago this was black magic, and now people can't figure out many things without sending a query to the AI.
  • #4 21796237
    MikeC
    Level 32  
    My ChatGPT 5.2 did it a bit differently:
    Technical flyer of L1050 LED driver IC with description, features, and pin configuration

    And this one into English and pseudo-Polish:

    Application diagrams using MT6706BL controller in flyback rectifier circuits
    Application diagram of MT6706BL controller with bridge rectifier and DC output
  • #5 21796292
    gulson
    System Administrator
    Nano banana the best, as usual.

    With hundreds of pages, however, it is best to run OCR on such a document, i.e. extract the Chinese text, and then translate it with a language model, this time without vision.
    Of course, the output will be plain text without the original layout, but the efficiency is very high.
    It is also possible to first use a model that splits each PDF page into images, tables and text, and then feed those separately to the models for translation.
    This is also the cheapest solution for many pages.
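
    The split-then-translate approach described above can be sketched as follows. This is only an outline under assumptions: `Block`, `translate_page`, `translate_document`, `segment` and `translate` are hypothetical names, and the layout/OCR model and the text-only language model are passed in as plain callables, since no specific tools are named here. The two stubs stand in for real OCR and real translation.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Block:
    kind: str     # "text", "table" or "image"
    content: str  # recognised source-language text; empty for pure images


def translate_page(blocks: List[Block], translate: Callable[[str], str]) -> List[Block]:
    """Translate text and table blocks; pass images through untouched."""
    result = []
    for block in blocks:
        if block.kind == "image" or not block.content.strip():
            result.append(block)
        else:
            result.append(Block(block.kind, translate(block.content)))
    return result


def translate_document(pages, segment, translate):
    """segment(page) -> list of Block; translate(text) -> translated text."""
    return [translate_page(segment(page), translate) for page in pages]


# --- stand-ins for a real layout/OCR model and a real language model ---
GLOSSARY = {"概述": "Overview", "特点": "Features"}


def fake_segment(page: str) -> List[Block]:
    # Pretend every line of the "page" is one text block found by OCR.
    return [Block("text", line) for line in page.splitlines()]


def fake_translate(text: str) -> str:
    # Look the whole block up in a tiny glossary; unknown text is returned as-is.
    return GLOSSARY.get(text, text)


if __name__ == "__main__":
    pages = ["概述\n特点"]
    for page in translate_document(pages, fake_segment, fake_translate):
        print([block.content for block in page])
```

    Because images are passed through unchanged, only the text blocks ever reach the language model, which is where the cost saving over regenerating the whole page would come from.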

    Translating documents the way Nano Banana did above will probably require a lot of computing power in the future, because it is actually generating the whole page anew.
  • #6 21796336
    Mateusz_konstruktor
    Level 37  
    gulson wrote:
    Translating documents the way Nano Banana did above will probably require a lot of computing power in the future, because it is actually generating the whole page anew.

    In my opinion, artificial intelligence will also lead China towards a standard of using English in electronic component documentation. By a roundabout route, but this is the aspect I see most clearly in this whole subject. There will be some rationalisation, although not in all cases. Quite simply, descriptions in English are much more useful when dealing with manufacturers, and the Chinese language is usually a huge impediment.
  • #7 21796376
    gulson
    System Administrator
    The idea of a global language has been around for quite a long time, and so far nothing has changed. Such documentation can be released in English in parallel; now they have a tool that makes this simpler.
    Or maybe it will even be translated "on the fly"?
  • #8 21796455
    p.kaczmarek2
    Moderator Smart Home
    @fachman1964 my impression is that their documentation is often rudimentary and untranslated, even when they do make it available. It is similar with the SDKs for Beken and other IoT chips.

    @MikeC I think your results are slightly better than mine. Could it be a different model than the one on LMArena?

    @gulson so more specifically it's the Nano Banana Pro - the version without the Pro is weak.
  • #9 21796459
    Mateusz_konstruktor
    Level 37  
    @gulson
    This is not about a global language, but about the equivalent of a technical drawing, i.e. a language that is universal by design and understood by everyone regardless of their mother tongue.
    English was and is used as an international language, but without the "global" attribute.
    In my opinion, many Chinese manufacturers will shift to producing documentation in English instead of Chinese, although probably in a way that the average end-user will hardly notice. Artificial intelligence will create pressure and de facto force a significant share of English. However, this will come about through activity in a completely separate area: marketing.

    There is already voice-to-text conversion, image search and online translation.
    This appears to be the natural order of things.
    The question: will it again take a dozen or more years?
  • #10 21796468
    p.kaczmarek2
    Moderator Smart Home
    What I am wondering is what it actually costs to have Nano Banana Pro rework one image like this, say, in the situation from my presentation. Has anyone seen such information anywhere? A quick web search turned up this post:
    Screenshot of Reddit post showing Nano Banana Pro API pricing and comparison with other image APIs
    https://www.reddit.com/r/Bard/comments/1p7qel..._banana_pro_api_pricing_complete_breakdown_8/

    So much for the API price (for us); we don't know how much it actually costs Google, or how much Google adds on top to make a profit....
  • #11 21796510
    Mateusz_konstruktor
    Level 37  
    @p.kaczmarek2
    Any rates, even those actually paid, are not a reliable indicator.
    Here, the decisive factor is something other than the actual cost.
    The end-user price is the resultant of many objectives and considerations, shaped above all by those who own the tools. The situation is analogous to the price of the proverbial bread on the shop shelf. At first glance the conclusion is: "after all, they have to make money on it". However, selling bread may not only be unprofitable, it may even be an additional cost. On top of that, it may be a cost that is incurred openly and deliberately, for example to meet a business plan for selling cheese. Profit itself may not be the objective, nor even a matter of interest. While it may be interesting to discuss offer prices, it is important to distinguish between price and cost.
  • #12 21796619
    PPK
    Level 30  
    I get the impression that in the case of Asian 'bushes', the AI rather searches the manufacturer's or other sites for a ready-made English translation... Mainly the complete ones....
  • #13 21796641
    MikeC
    Level 32  
    PPK wrote:
    I get the impression that in the case of Asian "bushes", the AI rather searches the manufacturer's or other sites for a ready-made English translation... Mainly the complete ones...

    It translates from these bushes without any problem ...
  • #14 21800106
    p.kaczmarek2
    Moderator Smart Home
    Test with Polish.
    translate all to polish
    L1050 chip datasheet showing features, description, ordering info, and SOP-16 layout

    gpt-image-1.5
    Technical datasheet for L1050 LED driver with features, description, and package diagram
    L1050 LED driver datasheet with features, order info, and pinout diagram on blue background

    gemini-3-pro-image-preview-2k (nano-banana-pro)
    Diagram of L1050 IC with technical data and feature summary
    Diagram and specification sheet of LED driver L1050 with technical and ordering info
  • #15 21800127
    Mateusz_konstruktor
    Level 37  
    Why do images two and three have the left and right sections cut off?
  • #16 21800141
    p.kaczmarek2
    Moderator Smart Home
    I think we've talked about this before, but this is simply how the AI generates it. Below is a short video of what it looks like on my end:



    gpt-image-1.5 always crops the images for me, no matter whether I download or copy them.

    Do you also experience this problem?
  • #17 21800166
    Mateusz_konstruktor
    Level 37  
    p.kaczmarek2 wrote:
    Are you also experiencing this problem?

    I encounter such a problem in cases of incompatible settings or incompatible web browsers themselves.
    Parts of web pages sometimes get cut off as a result, and this happens especially with unusual settings.
    Could you provide the address of a website where this can be verified?
  • #18 21800340
    p.kaczmarek2
    Moderator Smart Home
    Mateusz_konstruktor wrote:

    Could you provide the website address where the above can be verified?

    The website address is the same all the time, as in the first paragraph:
    p.kaczmarek2 wrote:
    into English by sending it as an attachment with the prompt to the AI. I will use the free LMArena website to run the AI models:
    https://lmarena.ai/


    Mateusz_konstruktor wrote:

    I encounter such a problem in cases of incompatible settings or incompatible web browsers themselves.
    Parts of web pages sometimes get cut off as a result, and this happens especially with unusual settings.

    I've been doing frontend and backend for some fifteen years and I have never seen a browser crop a downloaded (source) image. How would that even work? The browser does not perform a separate crop operation on a resource it requests with GET. An image can be clipped only on the web page itself (e.g. in img tags), but then opening the image directly will still show it unclipped, unless the backend is already sending it clipped.

    To clarify - I'm talking about the Response field of the GET request.

    The problem you describe could occur if I took a "print screen" instead of saving the image, but... that would be even more clicking than a simple "Save as".
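
    One way to check this independently of any browser is to read the dimensions straight from the bytes of the saved file. A minimal sketch, assuming the file is a PNG (the format actually served by LMArena is not confirmed here) and with `png_size` as a hypothetical helper name: in a PNG, the width and height sit in the IHDR chunk right after the 8-byte signature.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(data: bytes) -> tuple:
    """Return (width, height) read from the IHDR chunk of PNG bytes."""
    # Layout: 8-byte signature, 4-byte chunk length, 4-byte chunk type "IHDR",
    # then width and height as big-endian unsigned 32-bit integers.
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    width, height = struct.unpack(">II", data[16:24])
    return width, height


if __name__ == "__main__":
    import sys

    # Usage: python png_size.py saved_image.png
    with open(sys.argv[1], "rb") as handle:
        print(png_size(handle.read()))
```

    If the file saved from the response already reports the cropped dimensions, the clipping happened on the generating side, before the browser displayed anything.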
  • #19 21800409
    Mateusz_konstruktor
    Level 37  
    p.kaczmarek2 wrote:
    I've been doing frontend and backend for fifteen years and I haven't encountered a browser clipping a downloaded (source) image, how could that work?

    This could be linked to the frame size of the element depending on the screen size, with the dynamically generated image then being fitted to it automatically. Sometimes a slightly unusual web browser or some setting is enough. I have seen this a few times on the websites of electronic component manufacturers from regions that are quite exotic to us.

    And isn't there, in this "free" variant, a limit on the size of the images?
    Maybe exceeding the limit causes this clipping?
    How about trying it with the same image, but saved at a lower resolution?
  • #20 21800480
    willyvmm
    Level 31  
    If the attached images are exactly what AI got, then I'm not surprised.
    Shit in => Shit out.