logo elektroda

PDF Issues: How to Copy Odd Characters from PDF to Word/Google Translator?

Willisek 24048 10
This content was translated from Polish to English.
  • #1 17516757
    Willisek
    Level 6  
    Hello,
    This has been discussed here before, but I did not find an answer. As the title says: when I copy text from a PDF file into Google Translate on the web, or simply into Word, a whole bunch of garbled characters appears instead of the copied text.

    Does anyone have a fix for this strange behavior?
  • #2 17516827
    viayner
    Level 43  
    Hello,
    Please give an example.
    Are you copying text or something "binary"? Is this a valid PDF document? Does the browser render it correctly? Is a font missing?
    Best regards
  • #3 17516854
    Willisek
    Level 6  
    viayner wrote:
    Hello,
    give an example,


    I am copying plain text in a language other than Polish, nothing unusual. Other PDFs copy without any problems, but this one will not, although it looks the same. The PDF document is as valid as can be. Neither the browser nor the Word text editor interprets the pasted text correctly. This is not a matter of missing characters such as "ą" or "ę". After copying, I just get a whole string of hearts, squares, dots, question marks, and so on.
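
A rough way to make the symptom described above measurable, for instance when comparing the output of different viewers, is to count how many pasted characters fall into Unicode categories that rarely occur in ordinary prose. This is only a minimal sketch: the function names and the 0.3 threshold are hypothetical, and the heuristic does not catch garbage made of ordinary punctuation such as dots or question marks.

```python
import unicodedata

# Unicode general categories that rarely occur in ordinary prose:
# So = other symbol (hearts, squares, U+FFFD), Co = private use,
# Cn = unassigned, Cc = control characters.
SUSPICIOUS_CATEGORIES = {"So", "Co", "Cn", "Cc"}


def suspicious_ratio(text: str) -> float:
    """Return the fraction of non-whitespace characters that look like copy garbage."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    bad = sum(1 for c in chars if unicodedata.category(c) in SUSPICIOUS_CATEGORIES)
    return bad / len(chars)


def looks_garbled(text: str, threshold: float = 0.3) -> bool:
    """Flag text whose share of suspicious characters exceeds the (arbitrary) threshold."""
    return suspicious_ratio(text) > threshold


if __name__ == "__main__":
    print(looks_garbled("♥□♥□ ♥□"))                   # symbols only
    print(looks_garbled("Plain text with ą and ę."))  # ordinary letters
```

Polish letters such as "ą" and "ę" are ordinary letters in Unicode, so they do not count as suspicious here.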
  • #4 17516900
    Anonymous
    Anonymous  
  • #5 17516921
    Willisek
    Level 6  
    Christophorus wrote:
    Check the file's properties in the PDF file viewer if there are no security settings, e.g. against copying the document content.


    There are no security settings.

    Added after 22 [minutes]:

    I can copy it, but when I paste it into the browser or into Word, strange symbols come out instead of text.
  • #6 17517014
    Anonymous
    Anonymous  
  • #7 17517016
    wojtek1234321
    Level 36  
    This may not be entirely on topic, but it might point you in the right direction.
    I had a similar problem, not with copying and viewing but with printing some PDF files: the printout was full of strange characters resembling "Chinese hieroglyphs".
    The method given here helped:

    "Delete the registry entry "Arial,0"="Arial,238". Path to the entry: [HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\FontSubstitutes] "Arial,0"="Arial,238""

    https://forum.dobreprogramy.pl/t/problem-z-drukowaniem-pdf-w-adobe/441958/6

    After this change, printing is back to normal: no more garbled characters or other strange symbols.
    Be careful with registry changes, though, so that you do not do more harm than good. Before changing anything, make a backup copy of the current registry, so that you have a known-good state to restore if something fails or causes complications.
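
Before deleting anything, it can help to confirm that the entry actually exists. The sketch below assumes the FontSubstitutes key was first exported to a text file, for example with the Windows `reg export` command, which also gives the backup recommended above. The function names are hypothetical, and the sample text is made up for illustration. Exported .reg files are often UTF-16 encoded, so the file may need to be decoded accordingly before being passed in.

```python
import re

# Value name taken from the quoted advice; treat it as an example, not a rule.
TARGET_NAME = "Arial,0"


def find_font_substitutes(reg_text: str) -> dict:
    """Collect "name"="value" string pairs from the text of a .reg export."""
    pairs = {}
    for line in reg_text.splitlines():
        match = re.match(r'^\s*"([^"]+)"\s*=\s*"([^"]*)"\s*$', line)
        if match:
            pairs[match.group(1)] = match.group(2)
    return pairs


def has_substitution(reg_text: str, name: str = TARGET_NAME) -> bool:
    """Return True if the export contains a substitution for the given font name."""
    return name in find_font_substitutes(reg_text)


# Illustrative sample only; a real export will list many more entries.
sample = '''Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\\SOFTWARE\\Microsoft\\Windows NT\\CurrentVersion\\FontSubstitutes]
"Arial,0"="Arial,238"
"Helv"="MS Sans Serif"
'''

if __name__ == "__main__":
    print(has_substitution(sample))  # prints True
```

Working on an exported file rather than the live registry keeps the check read-only, which fits the caution about registry edits.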
  • #8 17518369
    Willisek
    Level 6  
    Unfortunately, that did not help. I am using Windows 8.1.
  • #9 17518388
    dt1
    Admin of Computers group
    Are you able to attach a sample file for testing?
  • #10 17521026
    spp
    Level 12  
    What about Adobe Reader -> File -> Save as Other -> Text ...?

    If it is greyed out and cannot be clicked, it means there is copy protection.

Topic summary

The discussion revolves around issues encountered when copying text from a PDF file, resulting in strange characters appearing instead of the intended text. The user reports that while copying plain text from a valid PDF document, the output includes symbols like hearts, squares, and question marks, rather than the expected characters. Various suggestions are made, including checking for security settings in the PDF, using different PDF viewers, and exploring character encoding issues. One user mentions a potential solution involving registry edits to resolve similar problems with printing PDFs. However, the original poster indicates that these solutions have not resolved their issue, and they are using Windows 8.1. Additional advice includes using Adobe Reader's "Save as Other" function to extract text, which may be disabled if copy protection is present.
Summary generated by the language model.