Database computation

Is it true that if any of the log files in the corpus changes, a complete recalculation of the database is needed? What happens if the calculation does not complete before the next scheduled calculation?
Michael
Yes, it's true. The goal of the program is to create a report that reflects the current state of the log files. If it detects that any log file has changed, it needs to reanalyze all logs, because it cannot exclude the data from the old log and include the data from the new log. The only exception is ordinary uncompressed text logs with appended lines: in most cases the program should detect that lines were added and analyze only those lines without reanalyzing all the data.
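To make the distinction between an appended log and a changed log concrete, here is a minimal Python sketch of one way such a check could work. It is not the program's actual method: the names `snapshot`, `prefix_digest`, and `classify_change` are hypothetical, and comparing a hash of the previously analyzed bytes is only an assumed strategy.

```python
import hashlib
from pathlib import Path


def prefix_digest(path, length):
    """Return the SHA-256 hex digest of the first `length` bytes of a file."""
    h = hashlib.sha256()
    remaining = length
    with open(path, "rb") as f:
        while remaining > 0:
            chunk = f.read(min(65536, remaining))
            if not chunk:
                break
            h.update(chunk)
            remaining -= len(chunk)
    return h.hexdigest()


def snapshot(path):
    """Record what was analyzed: the file size and a digest of its content."""
    size = Path(path).stat().st_size
    return {"size": size, "digest": prefix_digest(path, size)}


def classify_change(path, old):
    """Compare a log file with its earlier snapshot (hypothetical helper).

    Returns "unchanged", "incremental" (only new bytes were appended),
    or "full" (the file was removed, truncated, or rewritten).
    """
    p = Path(path)
    if not p.exists():
        return "full"  # a removed log invalidates the earlier analysis
    size = p.stat().st_size
    if size < old["size"]:
        return "full"  # the file shrank, so old content is gone
    if prefix_digest(path, old["size"]) != old["digest"]:
        return "full"  # previously analyzed bytes were modified
    return "incremental" if size > old["size"] else "unchanged"
```

Under these assumptions, only the "incremental" case lets the analyzer read from the old size onward; every other change forces a full reanalysis. A compressed log would normally fall into the "full" case, since appending to the original text changes the compressed bytes throughout.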

If you use the built-in scheduler, it always analyzes only one profile at a time. So if a calculation does not complete before the next scheduled one, the second calculation starts only after the first one finishes.
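The one-at-a-time behavior can be pictured as a single worker draining a queue. The following Python sketch only illustrates that idea; `SerialScheduler` and its methods are hypothetical names and say nothing about how the built-in scheduler is actually implemented.

```python
import queue
import threading
import time


class SerialScheduler:
    """Run submitted jobs strictly one after another (illustrative only)."""

    def __init__(self):
        self._jobs = queue.Queue()
        # A single worker thread guarantees that jobs never overlap.
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, name, job):
        """Queue a job; it starts only after all earlier jobs have finished."""
        self._jobs.put((name, job))

    def _run(self):
        while True:
            name, job = self._jobs.get()
            try:
                job()
            except Exception as exc:
                # Keep the worker alive so later jobs still run.
                print(f"job {name!r} failed: {exc}")
            finally:
                self._jobs.task_done()

    def wait(self):
        """Block until every queued job has completed."""
        self._jobs.join()


if __name__ == "__main__":
    scheduler = SerialScheduler()
    scheduler.submit("first run", lambda: time.sleep(0.2))
    scheduler.submit("second run", lambda: print("second run started"))
    scheduler.wait()
```

In this sketch the second run is not dropped or run in parallel when the first overruns its slot; it waits in the queue and starts as soon as the first one finishes.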
Bob H
And that brings up another topic: Once I have completed the creation of a database for, say, 2013, do I need to retain the log files from which it was generated, or can I dispose of them?

Thanks,

Bob H.
Michael
You need to keep the log files. If the program detects that some log files have been removed, it reanalyzes the data to ensure that the reports reflect the current state of the log files.
Bob H
So, I had totally misunderstood the structure of the database. I thought that the database incorporated all the data needed to produce reports for previous years.

It seems to me that you might want to adjust the architecture so that the log files can be completely discarded at the end of each reporting period. Even zipped, those ancient files (1-2 years old) can waste space.