The massive data gathering should only be part of the system's learning phase, imo. Once it has a good model of reality, it should be able to infer useful knowledge from very little data, like an expert does.