What are the differences between IMERG Early, Late, and Final Runs, and which should be used for research?

The main difference between the IMERG Early and Late Runs is that the Early Run uses only forward propagation (which amounts to extrapolation forward in time), while the Late Run uses both forward and backward propagation (allowing interpolation). In addition, the additional 10 hours of latency allow lagging data transmissions to make it into the Late Run, even if they were not available for the Early Run (see below).

Two factors can contribute to differences between the IMERG Late Run and Final Run datasets:

The two runs are adjusted to gauge data differently. Where the Late Run is adjusted at all, it uses a climatological adjustment that incorporates gauge data; in Versions 05 and 06 no such adjustment has been applied. In Version 04 this was a climatological adjustment to the Final Run, which includes gauge data at the monthly scale. For Version 03 the TRMM V7 climatological adjustment of the TMPA-RT to the production TMPA was used (which includes gauge data at the monthly scale), because this at-launch algorithm didn't yet have any Late and Final data from which to build the climatological adjustment. The Final Run, by contrast, uses a month-to-month adjustment to the monthly Final Run product, which combines the multi-satellite data for the month with GPCC gauge data. Its influence in each half hour is a ratio multiplier that is fixed for the month but varies spatially.
The Late Run is computed about 14 hours after observation time, so sometimes a microwave overpass is not delivered in time for the Late Run but arrives later and can be used in the Final Run. This affects both the half hour in which the overpass occurs and (potentially) the morphed values in nearby half hours.

The satellite sensor difference can be examined by comparing the satellite sensor data field in the Late and Final Run datasets for each half hour. Because the gauge adjustment at a given grid box is a constant multiplier within each month, a time series at that grid box should show a constant ratio between the Late and Final Runs for the entire month. The exceptions are half hours in which the satellite sensor differs between the two runs, which is also how differences arise over the ocean.
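The constant-ratio check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `monthly_ratio_check`, the `min_rate` and `rel_tol` thresholds, and the sample values are all hypothetical, and the inputs are assumed to be half-hourly precipitation rates already extracted for one grid box and one month from the Late and Final Run files.

```python
from statistics import median


def monthly_ratio_check(late, final, min_rate=0.1, rel_tol=0.01):
    """Return (median_ratio, outlier_indices) for one grid box and one month.

    late, final: equal-length sequences of half-hourly precipitation rates.
    min_rate and rel_tol are illustrative thresholds, not IMERG parameters.
    """
    if len(late) != len(final):
        raise ValueError("late and final must cover the same half hours")

    # Only half hours with meaningful Late Run precipitation give a stable ratio.
    ratios = {
        i: f / l
        for i, (l, f) in enumerate(zip(late, final))
        if l >= min_rate
    }
    if not ratios:
        return None, []

    # The median ratio approximates the month's gauge-adjustment multiplier.
    typical = median(ratios.values())

    # Half hours whose ratio departs from it are candidates for a sensor difference.
    outliers = [i for i, r in ratios.items() if abs(r - typical) > rel_tol * typical]
    return typical, outliers


# Made-up values: a constant 1.2 multiplier, except in the last half hour.
late = [0.0, 1.0, 2.0, 4.0, 0.5, 3.0]
final = [0.0, 1.2, 2.4, 4.8, 0.6, 5.0]
ratio, outliers = monthly_ratio_check(late, final)
print(ratio, outliers)
```

In this made-up series only the last half hour is flagged. In real data, the flagged half hours are the ones worth checking against the satellite sensor data field in the two runs.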

We always advise people to use the Final Run for research. The vast majority of grid boxes have fairly similar Late and Final values, so it makes sense to stick to metrics that are more resistant to occasional data disturbances. Extreme values are more sensitive to these details; medians, means, and root-mean-square differences are less sensitive.
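To make the point about metric sensitivity concrete, here is a small Python sketch that compares the metrics on two series that differ in a single half hour. The function name `compare_runs` and the sample values are hypothetical and chosen only for illustration; they are not taken from IMERG data.

```python
from math import sqrt
from statistics import mean, median


def compare_runs(late, final):
    """Summarize how two equal-length precipitation series differ."""
    diffs = [f - l for l, f in zip(late, final)]
    return {
        "mean_diff": mean(final) - mean(late),
        "median_diff": median(final) - median(late),
        "rmsd": sqrt(mean([d * d for d in diffs])),
        "max_diff": max(final) - max(late),
    }


# Made-up series: identical except for one half hour with a much larger Final value.
late = [0.0] * 18 + [1.0, 2.0]
final = [0.0] * 18 + [1.0, 12.0]
stats = compare_runs(late, final)
for name, value in stats.items():
    print(name, round(value, 3))
```

In this example the single disturbed half hour changes the maximum by the full size of the disturbance, while the mean and root-mean-square difference change much less and the median does not change at all.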