File Analysis
SQLize files to analyze dumps and logs.
FileAnalysis is a simple tool that pushes data from various file formats,
such as CSV, tab-delimited, Apache HTTP logs, Log4j, Logback and others,
into an embedded, disk-persisted H2 database. Once the data is in, you can run
SQL queries against it and analyze it in a variety of ways.
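As a rough illustration of what "SQLizing" a file can involve, the sketch below derives a table definition and a parameterized insert statement from a CSV header line. It is not taken from FileAnalysis: the class `SqlizeSketch`, its method names, and the choice to store every column as `VARCHAR` are all assumptions made for the example, and the naive `split(",")` ignores quoted fields.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.StringJoiner;

// Hypothetical helper for illustration only; not FileAnalysis's actual code.
class SqlizeSketch {

    // Turn an arbitrary header cell into a safe SQL identifier.
    static String toColumnName(String header) {
        String name = header.trim().toUpperCase(Locale.ROOT).replaceAll("[^A-Z0-9_]", "_");
        if (name.isEmpty() || Character.isDigit(name.charAt(0))) {
            name = "C_" + name; // identifiers should not be empty or start with a digit
        }
        return name;
    }

    // Naive split: assumes no quoted fields containing commas.
    static List<String> columns(String headerLine) {
        List<String> cols = new ArrayList<>();
        for (String cell : headerLine.split(",")) {
            cols.add(toColumnName(cell));
        }
        return cols;
    }

    // Assumption: every column is stored as text; a real importer might infer types.
    static String createTable(String table, List<String> cols) {
        StringJoiner defs = new StringJoiner(", ", "CREATE TABLE " + table + " (", ")");
        for (String col : cols) {
            defs.add(col + " VARCHAR");
        }
        return defs.toString();
    }

    // Builds a template with '?' placeholders, suitable for a JDBC PreparedStatement.
    static String insertTemplate(String table, List<String> cols) {
        StringJoiner names = new StringJoiner(", ");
        StringJoiner marks = new StringJoiner(", ");
        for (String col : cols) {
            names.add(col);
            marks.add("?");
        }
        return "INSERT INTO " + table + " (" + names + ") VALUES (" + marks + ")";
    }

    public static void main(String[] args) {
        List<String> cols = columns("host,status code,bytes");
        System.out.println(createTable("ACCESS_LOG", cols));
        System.out.println(insertTemplate("ACCESS_LOG", cols));
    }
}
```

With an H2 driver on the classpath, statements like these could be executed through plain JDBC; using placeholders rather than concatenating values keeps the row data out of the SQL text.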
Features
- Run SQL queries against structured data
- Supports pagination during queries
- Interactive tool
- Plugin mechanism to add more file formats
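To make the plugin idea above concrete, here is a minimal sketch of how a file-format plugin contract could look. The `FileFormat` interface, `DelimitedFormat`, and `FormatRegistry` are hypothetical names invented for this example; the actual FileAnalysis plugin API may be shaped quite differently.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.regex.Pattern;

// Hypothetical plugin contract; the real FileAnalysis interface may differ.
interface FileFormat {
    String name();                   // short identifier, e.g. "tsv"
    List<String> parse(String line); // one input line -> one row of column values
}

// One possible plugin: splits each line on a fixed delimiter.
class DelimitedFormat implements FileFormat {
    private final String name;
    private final String delimiter;

    DelimitedFormat(String name, String delimiter) {
        this.name = name;
        this.delimiter = delimiter;
    }

    @Override
    public String name() {
        return name;
    }

    @Override
    public List<String> parse(String line) {
        // Pattern.quote treats the delimiter literally ("|" is a regex metacharacter);
        // the -1 limit keeps trailing empty columns.
        return Arrays.asList(line.split(Pattern.quote(delimiter), -1));
    }
}

// Keeps track of the formats that have been plugged in.
class FormatRegistry {
    private final Map<String, FileFormat> formats = new LinkedHashMap<>();

    void register(FileFormat format) {
        formats.put(format.name(), format);
    }

    Optional<FileFormat> lookup(String name) {
        return Optional.ofNullable(formats.get(name));
    }
}
```

Under this design, supporting a new format means writing one class that implements `FileFormat` and registering it; the import and query code never needs to know which format produced a row.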
Inspiration
The idea comes from the fact that I need to analyze a lot of log files
and dumps and decipher information from them. Tools such as Excel and Splunk
have never helped me do things faster, so I always ended up writing code
to do my tasks.
I then saw a small demo video of the textql tool at https://github.com/dinedal/textqlb
The concept of the tool was fantastic: push data into an in-memory SQL store and then
run a query against the data.
I improved upon the idea by also persisting the data on disk, so that multiple
queries could be run against it. I also added formatting to display the results of
SELECT queries neatly, and made sure that when the result set had hundreds of rows,
the output was paginated with the user’s consent.
Once that was done, I thought of extending it to many more formats that I often use.
And thus, FileAnalysis was born.
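The consent-based pagination mentioned above can be sketched roughly as follows. This is an illustration under assumptions, not the tool's real code: `PagerSketch`, the page size of 10, and the "Show more?" prompt are all invented here, and rows are modeled as plain strings rather than a JDBC result set.

```java
import java.io.PrintStream;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.function.BooleanSupplier;

// Hypothetical pager for illustration only; not FileAnalysis's actual code.
class PagerSketch {

    // Prints rows one page at a time and returns how many rows were shown.
    // wantsMore is consulted between pages, only while rows remain.
    static int show(List<String> rows, int pageSize, BooleanSupplier wantsMore, PrintStream out) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        int shown = 0;
        while (shown < rows.size()) {
            int end = Math.min(shown + pageSize, rows.size());
            for (String row : rows.subList(shown, end)) {
                out.println(row);
            }
            shown = end;
            // Stop as soon as the user declines to see the next page.
            if (shown < rows.size() && !wantsMore.getAsBoolean()) {
                break;
            }
        }
        return shown;
    }

    public static void main(String[] args) {
        List<String> rows = new ArrayList<>();
        for (int i = 1; i <= 25; i++) {
            rows.add("row " + i);
        }
        Scanner in = new Scanner(System.in);
        show(rows, 10, () -> {
            System.out.print("Show more? [y/N] ");
            return in.hasNextLine() && in.nextLine().trim().equalsIgnoreCase("y");
        }, System.out);
    }
}
```

Passing the consent check in as a `BooleanSupplier` keeps the paging loop independent of the console, which makes it easy to exercise without interactive input.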
Changelog
- Added CSV format - comma-delimited files
- Added Apache log4j format
- Added logback format
- Added Apache httpd log format
- Added TSV format - tab-delimited files
- Added pipe-delimited format
- Added custom-delimited format
License
The library is released under the terms of the Apache License, Version 2.0.