Here in Brazil we sometimes find imported data that is more than four months old and that nobody noticed until now. These imports are usually followed by other changesets that delete the old data, plus changesets that modify or adjust the imported data.
We also see changesets in which people delete a lot of data, either deliberately or by accident.
Could a Bayesian filter, an SVM, or some other classifier be used to flag suspect changesets? Is there a smart, automated approach to this task?
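
To make the question concrete, here is a rough sketch of the kind of thing I have in mind: a tiny Gaussian naive Bayes classifier in plain Python. Everything in it is an assumption for illustration only. The two features (share of deletions and log of the changeset size), the function names, and the handful of labelled examples are all made up, not taken from real changesets. A real filter would need many manually reviewed changesets and probably more features.

```python
import math
from collections import defaultdict

def features(created, modified, deleted):
    """Turn raw changeset counts into two hypothetical features."""
    total = created + modified + deleted
    if total == 0:
        return [0.0, 0.0]
    # Share of deletions, and log-scaled size so huge changesets do not dominate.
    return [deleted / total, math.log1p(total)]

def train(samples):
    """samples: list of (feature_vector, label) pairs.

    Returns {label: (prior, means, variances)} for Gaussian naive Bayes.
    """
    by_label = defaultdict(list)
    for vec, label in samples:
        by_label[label].append(vec)
    model = {}
    for label, vecs in by_label.items():
        dims = range(len(vecs[0]))
        means = [sum(v[i] for v in vecs) / len(vecs) for i in dims]
        # A small constant keeps the variance away from zero on tiny datasets.
        variances = [
            sum((v[i] - means[i]) ** 2 for v in vecs) / len(vecs) + 1e-3
            for i in dims
        ]
        model[label] = (len(vecs) / len(samples), means, variances)
    return model

def classify(model, vec):
    """Return the label with the highest log posterior for vec."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for x, mean, var in zip(vec, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Invented training data: (created, modified, deleted) counts with a manual label.
TRAINING = [
    (features(50, 20, 2), "normal"),
    (features(10, 5, 0), "normal"),
    (features(200, 80, 10), "normal"),
    (features(5, 30, 1), "normal"),
    (features(0, 3, 900), "suspect"),
    (features(10, 0, 5000), "suspect"),
    (features(2, 1, 300), "suspect"),
    (features(0, 0, 1500), "suspect"),
]

model = train(TRAINING)
print(classify(model, features(1, 0, 2000)))
print(classify(model, features(40, 10, 1)))
```

The idea would be to use the result only to put a changeset in a review queue, not to revert anything automatically, since a large cleanup by an experienced mapper would look the same to these two features as an accidental mass deletion.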