In 2001, I had access to the server logs of a Tonigy website. I tried several tools to analyze them (at least Moglan, Webalizer, Analog, ALA, and the hypermart.net service), but I didn't like the results. So I decided to write my own web log analyzer.
Although I wrote Tonigy in C, I decided to use Java for the analyzer. Java had a much richer standard library than C, where even simple string manipulation was too verbose.
I was also writing in Perl at the time, but I was never a fan of that language. I didn't see Perl as a language for anything more complex than a few CGI files.
In 2001 I created an analyzer called Webolog. I never published it, but I used it a lot in my work. It was a command-line tool for Java 1.1+. It took logs in Common or Combined Log Format, extracted and aggregated the data, and generated a set of HTML files with statistics.
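For illustration, here is a minimal sketch of how one line in Common or Combined Log Format could be split into fields. It is not the original Webolog code: the class name `LogLineParser` is hypothetical, and it relies on `java.util.regex`, which only appeared in Java 1.4, so a Java 1.1 tool must have done this differently.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not taken from Webolog.
final class LogLineParser {
    // host ident authuser [date] "request" status bytes, optionally "referer" "user-agent".
    // Simplification: quoted fields are assumed to contain no escaped quotes.
    private static final Pattern LINE = Pattern.compile(
        "^(\\S+) (\\S+) (\\S+) \\[([^\\]]+)\\] \"([^\"]*)\" (\\d{3}) (\\S+)"
        + "(?: \"([^\"]*)\" \"([^\"]*)\")?$");

    /** Returns the nine fields of a log line, or null if the line does not match. */
    static String[] parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        String[] fields = new String[m.groupCount()];
        for (int i = 0; i < fields.length; i++) {
            // The last two fields (referer, user agent) stay null for Common Log Format.
            fields[i] = m.group(i + 1);
        }
        return fields;
    }
}
```

The optional trailing group is what lets one pattern accept both formats: a Common Log Format line simply has no referer and user-agent fields.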
I do not have logs that old, but I found screenshots from May 2001. Funny, they show MS IE as the browser.
Total visits:
"Referers":
Search queries (from "referers"):
Query extraction is configurable: each entry maps a search site's host to the name of the URL parameter that carries the query. The search sites at that time:
===
web.altavista.com q
www.altavista.com q
www.lycos.com query
www.google.com q
www.google.fr q
www.google.de q
www.search.com qt
google.yahoo.com p
search.163.com key
en.os2.org query
===
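A host-to-parameter table like the one above could be applied to a referer roughly as follows. This is only a sketch in modern Java, not Webolog's actual code: the class `QueryExtractor` and its methods are hypothetical, and decoding the query as UTF-8 is an assumption (search sites in 2001 often used other encodings).

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URLDecoder;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical class for illustration; not taken from Webolog.
final class QueryExtractor {
    // Search site host -> name of the URL parameter that holds the query.
    private final Map<String, String> paramByHost = new HashMap<>();

    void addSite(String host, String param) {
        paramByHost.put(host.toLowerCase(Locale.ROOT), param);
    }

    /** Returns the decoded search query, or null if the referer is not a known search URL. */
    String extract(String referer) {
        try {
            URI uri = URI.create(referer);
            String host = uri.getHost();
            String query = uri.getRawQuery();
            if (host == null || query == null) {
                return null;
            }
            String param = paramByHost.get(host.toLowerCase(Locale.ROOT));
            if (param == null) {
                return null; // not a configured search site
            }
            String prefix = param + "=";
            for (String pair : query.split("&")) {
                if (pair.startsWith(prefix)) {
                    // Assumption: the query is UTF-8 encoded.
                    return URLDecoder.decode(pair.substring(prefix.length()), "UTF-8");
                }
            }
            return null;
        } catch (IllegalArgumentException | UnsupportedEncodingException e) {
            return null; // malformed URL or escape sequence
        }
    }
}
```

With `addSite("www.google.com", "q")`, the referer `http://www.google.com/search?hl=en&q=os%2F2+cd+ripper` yields the query `os/2 cd ripper`.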
Browsers used:
And OS used:
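User-agent data extraction is also configurable with regular expressions. Each rule consists of a pattern, a browser name, a browser name with version (where $1 stands for the first capture group), and an OS name: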
===
; Opera
"Opera/(\S+) \(.*Windows 9(?:5|8).*\)\s+\[[a-z]{2}\]" "Opera" "Opera $1" "Windows 95/98"
"Opera/(\S+) \(.*Windows NT 4\.0;.*\)\s+\[[a-z]{2}\]" "Opera" "Opera $1" "Windows NT"
"Opera/(\S+) \(.*Windows NT 5\.\d;.*\)\s+\[[a-z]{2}\]" "Opera" "Opera $1" "Windows 2000"
"Opera/(\S+) \(.*Linux.*\)\s+\[[a-z]{2}\]" "Opera" "Opera $1" "Linux"
"Opera/(\S+) \(.*OS/2.*\)\s+\[[a-z]{2}\]" "Opera" "Opera $1" "OS/2"
"Mozilla/\S+ \(.*Windows 9(?:5|8).*\) Opera (\S+)\s+\[[a-z]{2}\]" "Opera" "Opera $1" "Windows 95/98"
"Mozilla/\S+ \(.*Windows 2000.*\) Opera (\S+)\s+\[[a-z]{2}\]" "Opera" "Opera $1" "Windows 2000"
"Mozilla/\S+ \(.*Linux.*\) Opera (\S+)\s+\[[a-z]{2}\]" "Opera" "Opera $1" "Linux"
"Mozilla/\S+ \(.*Windows 3\.10.*\) Opera (\S+)\s+\[[a-z]{2}\]" "Opera" "Opera $1" "Windows 3.1"
; Internet Explorer
"Mozilla/\S+ \(.*MSIE (\S+);.*Windows 3\.1\)" "Internet Explorer" "Internet Explorer $1" "Windows 3.1"
"Mozilla/\S+ \(.*MSIE (\S+);.*Windows 9(?:5|8).*\)" "Internet Explorer" "Internet Explorer $1" "Windows 95/98"
"Mozilla/\S+ \(.*MSIE (\S+);.*Windows NT 5\.\d.*\)" "Internet Explorer" "Internet Explorer $1" "Windows 2000"
"Mozilla/\S+ \(.*MSIE (\S+);.*Windows NT(?:| 4.0).*\)" "Internet Explorer" "Internet Explorer $1" "Windows NT"
"Mozilla/\S+ \(.*MSIE (\S+);.*Mac_PowerPC.*\)" "Internet Explorer" "Internet Explorer $1" "Mac/PowerPC"
; Netscape Navigator
"Mozilla \(OS/2; (?:I|U); OS/2 Warp\)" "Netscape Navigator" "Netscape Navigator" "OS/2"
"Mozilla/(\S+) \(.*OS/2.*\)" "Netscape Navigator" "Netscape Navigator $1" "OS/2"
"Mozilla/(\S+) \(.*Linux.*\)" "Netscape Navigator" "Netscape Navigator $1" "Linux"
"Mozilla/(\S+) \(.*Macintosh.*68K\)" "Netscape Navigator" "Netscape Navigator $1" "Mac/68K"
"Mozilla/(\S+) \(.*Macintosh.*PPC\)" "Netscape Navigator" "Netscape Navigator $1" "Mac/PowerPC"
"Mozilla/(\S+) \[[a-z]{2}\].* \(.*OS/2.*\)" "Netscape Navigator" "Netscape Navigator $1" "OS/2"
"Mozilla/(\S+) \[[a-z]{2}\].* \(.*Win9(?:5|8).*\)" "Netscape Navigator" "Netscape Navigator $1" "Windows 95/98"
"Mozilla/(\S+) \[[a-z]{2}\].* \(.*WinNT.*\)" "Netscape Navigator" "Netscape Navigator $1" "Windows NT"
"Mozilla/(\S+) \[[a-z]{2}\].* \(.*Windows NT 5\.\d.*\)" "Netscape Navigator" "Netscape Navigator $1" "Windows 2000"
"Mozilla/(\S+) \[[a-z]{2}\].* \(.*Linux.*\)" "Netscape Navigator" "Netscape Navigator $1" "Linux"
"Mozilla/(\S+) \[[a-z]{2}\].* \(.*SunOS.*\)" "Netscape Navigator" "Netscape Navigator $1" "SunOS"
"Mozilla/(\S+) \[[a-z]{2}\].* \(.*AIX.*\)" "Netscape Navigator" "Netscape Navigator $1" "AIX"
"Mozilla/(\S+) \[[a-z]{2}\].* \(.*IRIX.*\)" "Netscape Navigator" "Netscape Navigator $1" "IRIX"
...
===
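Rules of this shape could be evaluated roughly like this. The sketch is not the original implementation: the class `AgentRules` is hypothetical, and it assumes that rules are tried in file order, that the first match wins, and that a pattern may match anywhere in the agent string. It also uses `java.util.regex` (Java 1.4+), which a Java 1.1 tool could not have used.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class for illustration; not taken from Webolog.
final class AgentRules {
    private static final class Rule {
        final Pattern pattern;
        final String browser;
        final String version; // may contain "$1" for the first capture group
        final String os;

        Rule(String regex, String browser, String version, String os) {
            this.pattern = Pattern.compile(regex);
            this.browser = browser;
            this.version = version;
            this.os = os;
        }
    }

    private final List<Rule> rules = new ArrayList<>();

    void add(String regex, String browser, String version, String os) {
        rules.add(new Rule(regex, browser, version, os));
    }

    /** Returns {browser, browser with version, OS} for the first matching rule, or null. */
    String[] classify(String agent) {
        for (Rule rule : rules) {
            Matcher m = rule.pattern.matcher(agent);
            if (m.find()) { // assumption: unanchored match, first rule wins
                String version = rule.version;
                if (m.groupCount() >= 1 && m.group(1) != null) {
                    version = version.replace("$1", m.group(1));
                }
                return new String[] {rule.browser, version, rule.os};
            }
        }
        return null;
    }
}
```

Under the first-match assumption the order of rules matters: Opera of that era could identify itself with an MSIE-style string, so an Opera rule has to come before the Internet Explorer rule that would otherwise also match.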
Webolog collects data in a simple database. For simplicity I didn't use an RDBMS. Instead, the data is stored in a ZIP file containing several binary and XML files. Not very efficient, but it was enough.
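As a rough idea of such a ZIP-based store, the sketch below packs named byte arrays into a ZIP stream and reads them back with `java.util.zip`. It is an assumption-laden illustration, not Webolog's real format: the class `ZipStore` and any entry names used with it are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical class for illustration; not taken from Webolog.
final class ZipStore {
    /** Writes each map entry as one file inside a ZIP archive. */
    static void write(Map<String, byte[]> files, OutputStream out) throws IOException {
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            for (Map.Entry<String, byte[]> file : files.entrySet()) {
                zip.putNextEntry(new ZipEntry(file.getKey()));
                zip.write(file.getValue());
                zip.closeEntry();
            }
        }
    }

    /** Reads all files of a ZIP archive into memory, keyed by entry name. */
    static Map<String, byte[]> read(InputStream in) throws IOException {
        Map<String, byte[]> files = new LinkedHashMap<>();
        try (ZipInputStream zip = new ZipInputStream(in)) {
            byte[] buffer = new byte[4096];
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                ByteArrayOutputStream data = new ByteArrayOutputStream();
                int n;
                while ((n = zip.read(buffer)) != -1) {
                    data.write(buffer, 0, n);
                }
                files.put(entry.getName(), data.toByteArray());
            }
        }
        return files;
    }
}
```

Passing a `FileOutputStream` or `FileInputStream` turns this into a single-file store. The whole archive has to be rewritten on every update, which fits the "not very efficient" remark.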
Then Google Analytics became usable and even popular, so I stopped using Webolog.
See also related notes:
- Loop counter name: "t" vs "i" (2024-02-21)
- Tonigy source code [OS/2, 2001-2002] (2023-12-03)
- Tonigy 20+ years (2021-12-03)