logo elektroda
logo elektroda
X
logo elektroda

Excel - remove duplicates while preserving the unique data in the joined row

trekkers 3264 3
ADVERTISEMENT
Treść została przetłumaczona polish » english Zobacz oryginalną wersję tematu
  • #1 16296222
    trekkers
    Level 10  
    Data sheet, over 100,000 rows, more than 50 columns. It's about getting rid of duplicates while keeping your data unique. For example, I create an additional column by combining first name, last name, company, I rank AZ, then I mark conditional formatting and mark duplicates, then filter by color (which with this amount of data will probably take about an hour) and only deal with duplicate data.
    So this is a person who, for example, in one found row has a given position, but in another there is no more, so by deleting a duplicate row I lose this data, and I would like to combine these rows into one (sometimes they are 3 duplicates). Well, there may be a situation where the data differ, because e.g. the current name of the dog is Bobik and not Azor, then the result from the line, e.g. which is above (or a different criterion), could be saved.
    I hope I wrote it as understandably. Can it be solved with some macro or function, or simply excel cannot be done in this case?
    I have attached an overview sheet
  • ADVERTISEMENT
  • #2 16296691
    nunak
    Level 14  
    And so I ask, built-in remove duplicates?
    What in the case of, say, Jan Kowalski who has Azor's dog, and Jan Kowalski, who has Bobik's dog?
    Is this the same Jan Kowalski?
    You set the scope of the duplicate check as you like, whether it should refer to one column or, for example, to all.
    Removing duplicates goes quite fast.
  • ADVERTISEMENT
  • #3 16297189
    trekkers
    Level 10  
    I assume that Jan Kowalski is the same person, but the data may come from different periods (it doesn't matter so much here). My point is that the data should not be lost, i.e. in one row, John Kowalski has the position given and in the other, he is gone, so when deleting one row, I lose some information, and this function, remove duplicates, will not merge the rows into one. In addition, the number of rows and columns is enormous and the differentiation of filling the cells with data is equally huge.
  • #4 16297585
    marek003
    Level 40  
    :arrow: trekkers And you've tried at least to apply your colleague's advice nunak 'a?

    What version of excel do you have? If 2010 and above you have the "remove duplicates" feature.
    Just do not select one column, all those that are to be individual.

    If you do not believe in the results, do a test with, for example, three columns and data. type in sample values, select the whole table and press "Remove duplicates" in the tab / ribbon (I don't know what they say now) "Data.

    Data example (data in separate columns):
    a d a
    b a b
    a b a
    c a c
    a c a
    b a b
    d b d
    a d a

    From the above, after selecting all the data, the "Remove duplicates" function will cut two rows-duplicates: a, d, a and b, a, b
ADVERTISEMENT