I have only a limited understanding of compression techniques, but I'll offer what I've got ;)
There's more than one compression technique. One is called "run-length encoding" (RLE). It is used by formats such as PCX, and as an option in BMP and TIFF. (GIF is often credited with it, but GIF actually uses LZW, a dictionary-based scheme.) As the bitmap image is scanned row by row, any "run" of consecutive pixels with exactly the same information (e.g. the same three RGB color numbers) is counted, and the whole run is replaced by one copy of the three color numbers plus the number of occurrences. That requires fewer bytes than the run itself, assuming a "run" is defined as something like "3 or more in succession".
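To make the run-length idea concrete, here is a minimal sketch in Python. The names `rle_encode`, `rle_decode`, and `min_run`, and the token layout, are my own for illustration; no real file format stores its data exactly this way.

```python
def rle_encode(pixels, min_run=3):
    """Collapse runs of identical pixels into ("run", count, value) tokens.

    Shorter stretches are kept as individual ("raw", value) tokens.
    The token layout is hypothetical, chosen for readability.
    """
    tokens = []
    i = 0
    while i < len(pixels):
        # Find the end of the stretch of pixels equal to pixels[i].
        j = i
        while j < len(pixels) and pixels[j] == pixels[i]:
            j += 1
        count = j - i
        if count >= min_run:
            tokens.append(("run", count, pixels[i]))
        else:
            tokens.extend(("raw", pixels[i]) for _ in range(count))
        i = j
    return tokens


def rle_decode(tokens):
    """Rebuild the original pixel row from the tokens."""
    pixels = []
    for token in tokens:
        if token[0] == "run":
            _, count, value = token
            pixels.extend([value] * count)
        else:
            pixels.append(token[1])
    return pixels


SKY = (135, 206, 235)  # made-up RGB values for the example
SUN = (255, 255, 0)
row = [SKY] * 5 + [SUN] * 2 + [SKY] * 4

encoded = rle_encode(row)
assert rle_decode(encoded) == row  # nothing is lost
print(len(row), "pixels ->", len(encoded), "tokens")
```

Here the 11 pixels collapse to 4 tokens: two runs of sky and two stand-alone sun pixels, since a run of 2 is below the threshold. Decoding gives back the row exactly, so this kind of compression is lossless.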
Another type of compression works to consolidate similar pieces of information. Unlike run-length encoding, this kind is "lossy": some detail is thrown away and can't be recovered. So, in the case of an image, if there is an area, such as a region of sky, that has a similar "signature," it is pared down to a more homogenized version that takes less data to convey. It's my understanding that one way to do this is to convert the data to the frequency domain and then discard or coarsen the high-frequency components, which amounts to some sort of low-pass filter. JPEG works along these lines, applying a discrete cosine transform to small blocks of the image.
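A rough sketch of the frequency-domain idea, again in Python. I'm assuming a discrete cosine transform as the conversion and a crude "keep the first few coefficients" rule as the low-pass filter; the function names and the brightness values are made up for the example, and real codecs are a good deal more elaborate than this.

```python
import math


def dct(signal):
    """Unnormalized DCT-II: turn samples into frequency coefficients."""
    n = len(signal)
    return [
        sum(signal[t] * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n))
        for k in range(n)
    ]


def idct(coeffs):
    """Inverse of dct() above: turn coefficients back into samples."""
    n = len(coeffs)
    return [
        (coeffs[0] / 2
         + sum(coeffs[k] * math.cos(math.pi * (t + 0.5) * k / n)
               for k in range(1, n))) * 2 / n
        for t in range(n)
    ]


def low_pass(signal, keep):
    """Zero all but the `keep` lowest-frequency coefficients, then rebuild."""
    coeffs = dct(signal)
    kept = coeffs[:keep] + [0.0] * (len(coeffs) - keep)
    return kept, idct(kept)


# One row of brightness values from a nearly uniform patch of "sky".
row = [100, 102, 101, 103, 102, 104, 103, 105]

kept, approx = low_pass(row, keep=2)
print("stored coefficients:", [round(c, 2) for c in kept[:2]])
print("rebuilt row:", [round(v, 1) for v in approx])
```

Only 2 numbers are stored instead of 8. The rebuilt row follows the gentle brightening of the original but loses the pixel-to-pixel flicker, which is the "homogenized" version described above.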
Then there is fractal compression, where the image is scanned for self-similar patterns, and these are replaced by codes that invoke fractal algorithms at the receiving end. These fractal algorithms regenerate an approximation of the patterns that they replace, so this method is lossy as well.
Once again, this is an overview of this subject from someone who is very much on the sidelines. I have gleaned this information from co-workers and magazine articles. The integrity of this information relies purely on the integrity of the sources and on the soundness of my memory.