The demise of LJ?

Mar 16, 2006 11:43

I posted this over in lj_research, but I thought it might be interesting to those of you who read this.



flying_blind made an interesting comment on my last post: maybe the changes we're seeing are a result of LJ purging deleted journals.

So I went and downloaded the last six years' worth of stats.bml from archive.org. (Note: most website rippers won't rip from archive.org, so I just took the page and cleaned it up so I had a list of URLs, and then I ran this Python code:

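import urllib.request

# internet-archive.txt: one archive.org snapshot URL per line
for line in open("internet-archive.txt"):
    url = line.strip()
    if url:
        # path component 4 is the yyyymmddvvvvvv snapshot timestamp
        with open(url.split('/')[4], 'wb') as f:
            f.write(urllib.request.urlopen(url).read())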
to download all the URLs.)

Then I wrote some quick code to extract the various numbers, bearing in mind that the labels in the text change over time. This is really here so that if someone wants to continue this work and look at other data, such as countries, they can adapt this code to their uses.

#given a directory archives/ of stores of livejournal.com/stats.bml
#taken from archive.org this program does its best to extract the data
#to a comma-separated file. requires they have the original filenames
#in the format yyyymmddvvvvvv. it throws away all the vs.
#
#hint: when importing to excel, once you've told it it's comma separated,
#tell it to make the first column format "date" (there's a radio button)
#with format YMD.

import os, re

nums = re.compile(r'\d+')

with open("aggregate.csv", 'w') as out:
    out.write('date,total,active,ever-updated,thirty-days,seven-days,twentyfour-hours\n')
    for filename in sorted(os.listdir('archives/')):
        print(filename)
        total = active = ever = thirty = seven = twentyfour = '0' #reset all the values so we don't get old data
        #errors='replace' so an odd byte in an old page doesn't stop the run
        with open("archives/"+filename, 'r', errors='replace') as page:
            for line in page:
                if "total users" in line.lower() or "total accounts" in line.lower():
                    total = nums.findall(line)[0]
                if "active in some way" in line:
                    active = nums.findall(line)[0]
                if "ever updated" in line:
                    ever = nums.findall(line)[0]
                #[1] below skips the 30 / 7 / 24 that appears in the label itself
                if "updating in last 30 days" in line:
                    thirty = nums.findall(line)[1]
                if "updating in last 7 days" in line:
                    seven = nums.findall(line)[1]
                if "updating in past 24 hours" in line:
                    twentyfour = nums.findall(line)[1]
        #the first 8 characters of the filename are the yyyymmdd date
        out.write(','.join([filename[0:8], total, active, ever, thirty, seven, twentyfour]) + "\n")


raw data:

date total active ever thirty seven twentyfour
9/25/2000 16673 0 11463 7257 4869 1956
11/21/2000 27149 0 18959 11146 7645 3789
12/8/2000 31207 0 21974 13026 8951 4467
3/8/2001 70760 0 51682 31841 21779 10527
4/13/2001 99218 0 73386 45068 30294 13920
10/11/2001 366272 0 279667 128120 89373 41968
11/16/2001 394998 0 306402 138186 99587 46963
2/6/2002 448914 0 356712 166011 121386 50211
2/24/2002 466351 0 373137 175183 129386 56916
6/2/2002 579827 0 467184 218073 157996 62194
10/14/2002 737975 0 605617 269447 192217 67867
6/2/2003 1079730 499233 909892 413602 291296 119504
8/1/2003 1208073 576000 1023854 472164 333676 147775
6/9/2004 3171504 1672790 2480173 1153683 756888 292198
6/10/2004 3171504 1674960 2480173 1153683 756888 292198
6/11/2004 3171504 1674878 2480173 1153683 756888 292198
6/15/2004 3171504 1686308 2480173 1153683 756888 292198
6/16/2004 3171504 1690559 2480173 1153683 756888 292198
6/18/2004 3171504 1699696 2480173 1153683 756888 292198
6/19/2004 3171504 1699540 2480173 1153683 756888 292198
6/22/2004 3171504 1705645 2480173 1153683 756888 292198
6/23/2004 3171504 1712938 2480173 1153683 756888 292198
6/24/2004 3171504 1714488 2480173 1153683 756888 292198
6/25/2004 3171504 1711651 2480173 1153683 756888 292198
6/29/2004 3171504 1709450 2480173 1153683 756888 292198
7/1/2004 3171504 1715591 2480173 1153683 756888 292198
7/6/2004 3651180 1716263 2808780 1226493 751602 299276
7/7/2004 3708697 1720146 2848320 1229306 768593 301153
7/8/2004 3708697 1721853 2848320 1229306 768593 301153
7/10/2004 3734146 1724404 2865482 1232715 774025 296352
7/13/2004 3755616 1724404 2880270 1232685 778415 247829
7/15/2004 3793626 1724404 2906023 1240660 797641 309296
7/15/2004 3806451 1724404 2914862 1241991 799913 307874
7/16/2004 3818758 1724404 2923299 1242791 801032 307276
7/16/2004 3818758 1724404 2923299 1242791 801032 307276
7/18/2004 3829532 1724404 2930764 1242020 799332 286878
7/21/2004 3877236 1724404 2963288 1247376 809530 311005
7/23/2004 3890123 1724404 2971892 1248005 810453 311698
7/26/2004 3924050 1807520 2995305 1254403 810511 255379
7/27/2004 3935830 1812705 3003314 1259093 813856 284534
8/10/2004 4118372 1874669 3127992 1305636 833333 321687
8/12/2004 4144810 1881331 3146147 1312026 839300 326662
8/14/2004 4169155 1887512 3162602 1315046 840630 308707
9/13/2004 4496864 1852380 3382074 1345340 851381 281565
10/9/2004 4766134 1926709 3563993 1333508 853269 286112
10/13/2004 4808384 1939882 3592951 1340867 865762 331466
10/15/2004 4818837 1942816 3599854 1341397 867028 322569
10/15/2004 4824896 1941188 3604865 1339832 868549 322589
10/16/2004 4835073 1941600 3611549 1339302 869693 287571
10/19/2004 4868434 1954138 3634077 1344709 874238 338952
10/20/2004 4877300 1953933 3640186 1343140 870426 310155
10/21/2004 4888833 1956471 3647813 1344534 874671 337610
10/22/2004 4898998 1960031 3654583 1345365 874378 317092
10/24/2004 4908170 1961438 3660896 1345414 873675 284759
10/24/2004 4918064 1961917 3667565 1345873 873259 259831
10/26/2004 4940633 1970546 3682451 1351040 871082 333389
10/27/2004 4951522 1972340 3689731 1351305 876183 329706
10/28/2004 4961585 1970245 3696444 1351425 872684 320696
10/30/2004 4971076 1986290 3702652 1351361 870591 312311
10/30/2004 4979290 1985957 3707919 1351165 865120 261423
10/31/2004 4987300 1986562 3713667 1350624 861910 249836
11/2/2004 5006445 2057934 3726298 1353917 858087 336718
11/3/2004 5006445 2059784 3726298 1353917 858087 336718
11/3/2004 5016842 2067944 3733519 1356385 863313 348931
11/6/2004 5030704 2108579 3743999 1357657 878942 327470
11/6/2004 5041005 2108579 3750770 1358420 885815 287806
11/8/2004 5051754 2114521 3757787 1359509 888432 259846
11/14/2004 5129691 2279120 3809873 1372528 892630 273112
11/17/2004 5152271 2299735 3824917 1378908 892510 344161
11/20/2004 5184246 2310829 3846140 1381078 894114 327709
11/28/2004 5270116 2341299 3905105 1393531 888948 264172
2/4/2005 5997569 2574659 4395827 1500048 973419 360274
2/7/2005 6018712 2576785 4409844 1500006 971515 299282
2/9/2005 6053779 2580194 4432523 1504864 968849 371584
2/11/2005 6065705 2580408 4437360 1502760 961438 332747
2/13/2005 6087300 2581909 4451510 1501987 960832 322325
2/14/2005 6098073 2577704 4458658 1500648 960341 297085
2/15/2005 6109556 2599982 4466026 1501195 959460 347799
2/19/2005 6156347 2607035 4493200 1510050 968111 352377
2/20/2005 6167300 2606669 4500512 1509725 968642 321491
2/24/2005 6218083 2617685 4530622 1513289 965491 365961
3/1/2005 6275035 2623325 4566763 1512152 968191 343453
3/2/2005 6286537 2623387 4573054 1512854 964598 361662
3/2/2005 6299804 2624233 4580397 1511824 965679 368324
3/5/2005 6322551 2624589 4594017 1509206 961164 347359
3/6/2005 6332616 2623840 4600950 1508044 961603 320286
3/8/2005 6354569 2626459 4614608 1508475 956604 330588
3/9/2005 6366282 2628737 4620437 1508359 954381 347003
3/10/2005 6377300 2629778 4627161 1507649 951900 355172
3/12/2005 6398982 2627253 4640888 1505658 951151 351416
3/18/2005 6462518 2637578 4683173 1511530 963976 360925
3/19/2005 6473047 2630707 4689873 1511204 962773 359118
3/26/2005 6550688 2630820 4740327 1513841 960322 344622
3/27/2005 6562190 2628423 4747649 1513675 959599 317627
3/29/2005 6585079 2633927 4762166 1515114 962805 335212
3/15/2006 9761877 1994777 6628623 1316574 803105 303107

So this is the end result. Note that there's no archive data after March last year -- presumably because of a change in robots.txt -- so I just copied the last row by hand from the current page. The change from that March until now is perhaps the most interesting part.




I think we're seeing one key thing here.

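Activity has gone down over the last year, and so has the number of journals updating in the last 30 days, 7 days, and 24 hours. Active accounts fell from 2,633,927 to 1,994,777 even though total accounts grew from 6,585,079 to 9,761,877. That's a very interesting result, and a little worrying to the powers that be, I'd think.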
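
To put rough percentages on that, here is a quick sketch that compares the 3/29/2005 row with the 3/15/2006 row from the raw data above. The dictionary names and the percent_change helper are just mine for illustration; the numbers are copied straight from the table.

```python
# rows copied from the raw data table above
march_2005 = {'total': 6585079, 'active': 2633927, 'ever': 4762166,
              'thirty': 1515114, 'seven': 962805, 'twentyfour': 335212}
march_2006 = {'total': 9761877, 'active': 1994777, 'ever': 6628623,
              'thirty': 1316574, 'seven': 803105, 'twentyfour': 303107}

def percent_change(old, new):
    """Change from old to new, as a percentage of old."""
    return 100.0 * (new - old) / old

# one percentage per column, keyed by the column name
changes = {name: percent_change(march_2005[name], march_2006[name])
           for name in march_2005}

for name, pct in changes.items():
    print(f"{name:>10}: {pct:+.1f}%")
```

Total and ever-updated come out positive, while active and the three "updating" columns come out negative, which is the pattern described above.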

So what's going on? Maybe...

1) We're seeing less LJ use. Maybe everyone's heading over to MySpace so they can put ten thousand little icons and widgets and little midi songs in every post, and never use capitals ever again, ever. There are a lot more companies out there offering similar services to LJ than there were six years ago, and blogging has changed over that time. This is the most likely option.

2) We're seeing a change in LJ use. More people are reading, but fewer people are posting. We're seeing more people using LJ on a regular basis to read their friends lists, but they're posting less. So we're seeing it change to more of a broadcast medium than the chat medium it once was. It would be interesting to look at things like raw K of text posted over time. However, this is only possible if "active" doesn't involve counting "reading friends lists without commenting", or "reading friends lists without commenting or without even being logged in."

3) There's a strong connection to the earlier post. That is, we're seeing a shift in demographics, producing the perceived shift in #2. For example, as has been shown elsewhere, Russian LJ works on much more of a broadcast model, with some very very popular people in essentially broadcast mode with much discussion among their readers. An increase in the percentage of Russian LJ users could produce this effect while showing an increase in popularity of LiveJournal in aggregate, again, subject to the definitions above.

4) Finally, I wonder how much these data actually characterize the usage patterns of LJ. In some ways, LJ is a blogging site LAST: in terms of sheer number of minutes people spend interacting with the website, it's really an elaborate web-based RSS aggregator. The next most popular form of usage is the facilitation of people making comments. And only after that is it used for blogging. (This is a pretty common form on the internet. A lot of sites, from acidplanet to slickdeals are actually essentially bulletin board readers with a commenting facility, and only occasionally in comparison do people actually post new songs or bargains, respectively.)

I thought these were some interesting data, and I'd be interested to see your responses to what you see in them. I'd also be interested if anyone feels like extracting out the other demographic data and looking at those changes over time, which I just didn't get around to doing (although it's very simple to modify the code above to do so.)
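
For anyone who does pick that up, one way to make the extractor easier to extend is to drive it from a table of labels instead of a chain of ifs. This is only a sketch and not the code I actually ran: FIELDS, extract, and the sample lines are made up for illustration, and any new label (countries, say) would have to be checked against what each archived page really says.

```python
import re

NUMS = re.compile(r'\d+')

# column name -> (lowercase text to look for, which number on the line to take)
# index 1 skips the "30" that appears in the label itself
FIELDS = {
    'total': ('total accounts', 0),
    'ever': ('ever updated', 0),
    'thirty': ('updating in last 30 days', 1),
}

def extract(lines, fields=FIELDS):
    """Return {column: number string}, with '0' for any label not found."""
    row = {name: '0' for name in fields}
    for line in lines:
        found = NUMS.findall(line)
        for name, (label, index) in fields.items():
            if label in line.lower() and len(found) > index:
                row[name] = found[index]
    return row

# made-up sample lines, not copied from a real stats page
sample = [
    "Total accounts: 1000",
    "... that have ever updated: 800",
    "... updating in last 30 days: 300",
]
print(extract(sample))
```

Adding another column is then one more entry in FIELDS, as long as the label text really appears on the page.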

jofish