July 2025 - Word Count Data, Top Authors and Musings on what data can't teach me
Intro
When I started looking into blockchain data by word count for posts, I wasn't sure what I really wanted to achieve. There is a difference between pure reporting and insights, and I like to think the feature engineering I have performed over the month gives us some insights by bringing to light data that wasn't present beforehand.
Features Built This Month
I have, through this month (and a little bit more) of work, made the following metrics available, that were not present in the raw HIVE SQL database:
- Buckets for Word Counts
- Pay Per Vote
- Pay Per Word
- Pay Per Image
- Count of Images in Post
- Post Outcome (Above/Below Avg Words & Above/Below Avg Pay)
- Detecting Swearing in Posts
- And a few others
I have used a combination of Power Query M code and DAX to derive the above things.
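For anyone who would rather see the idea as code, here is a minimal sketch of how a few of these features could be derived. It is written in Python rather than the M and DAX I actually used, so treat it as an illustration only. The bucket edges come from the labels in the table further down; how boundary values are handled, the `post` dictionary layout (`body`, `payout`, `votes`), the image regex and the function names are all assumptions of this sketch.

```python
import re

# Upper bucket edges, taken from the labels in the table later in this post.
# Treating each edge as exclusive is an assumption of this sketch.
BUCKET_EDGES = [50, 100, 250, 500, 750, 1000, 1500, 2000, 2500]

# Assumed image detection: markdown images or HTML <img> tags.
IMAGE_PATTERN = re.compile(r"!\[[^\]]*\]\([^)]*\)|<img\b", re.IGNORECASE)


def word_bucket(word_count):
    """Return the bucket label for a given word count."""
    for edge in BUCKET_EDGES:
        if word_count < edge:
            return f"< {edge} Words"
    # Anything of 2500 words or more lands here, labelled as in the table.
    return "> 2501 Words"


def safe_ratio(numerator, denominator):
    """Divide, returning None instead of failing when the denominator is 0."""
    return numerator / denominator if denominator else None


def engineer_features(post, avg_words, avg_pay):
    """Derive per-post features from a hypothetical post dictionary."""
    body = post["body"]
    # Naive word count: whitespace-separated tokens, image markup included.
    words = len(body.split())
    images = len(IMAGE_PATTERN.findall(body))
    words_side = "Above" if words > avg_words else "Below"
    pay_side = "Above" if post["payout"] > avg_pay else "Below"
    return {
        "word_count": words,
        "word_bucket": word_bucket(words),
        "image_count": images,
        "pay_per_vote": safe_ratio(post["payout"], post["votes"]),
        "pay_per_word": safe_ratio(post["payout"], words),
        "pay_per_image": safe_ratio(post["payout"], images),
        "outcome": f"{words_side} Avg Words & {pay_side} Avg Pay",
    }


example = {
    "body": "Hello world ![pic](http://x/y.png) more words here",
    "payout": 4.0,
    "votes": 8,
}
features = engineer_features(example, avg_words=400, avg_pay=2.59)
print(features)
```

The `safe_ratio` helper is there because plenty of posts have zero votes or zero images, and a blank value is more honest than a divide-by-zero error.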
This post is about sharing the month that was, with as many insights as I could possibly think of, without completely overwhelming HIVE SQL, or my own poor brain. There have been some challenges in the data set, and there are many accounts (such as curation aggregation posts, spam, bots, burn mechanisms, and others) that I have filtered out of the data to ensure I am getting as many human authors featured in this data as possible.
So, for the month of July, what have we achieved?
Not a bad effort. That's a lot of words, a lot of pictures. To put the words into perspective, if an average novel is 85,000 words, that's 445 novels published onto HIVE. 14 novels a day. :) Impressive!
But, it took 7,124 authors to write that content, so, on average, perhaps each author wrote 6% of a novel. If we follow that trend, and authors continue to submit content to hive, that means, an average author will write an average novel on Hive in about 16 months :) Not bad!
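The novel arithmetic can be checked with a few lines of Python. This is only a sketch built from the figures quoted in the text (85,000 words per novel, 445 novels, 7,124 authors) plus the 31 days of July; the total word count it prints is implied by those figures rather than read from the data set, and the variable names are my own.

```python
AVG_NOVEL_WORDS = 85_000   # assumed length of an average novel
NOVELS_PUBLISHED = 445     # novel-equivalents published in July
AUTHORS = 7_124            # authors who wrote that content
DAYS_IN_JULY = 31

# Word total implied by the novel count (an estimate, not a measured figure).
implied_total_words = NOVELS_PUBLISHED * AVG_NOVEL_WORDS

novels_per_day = NOVELS_PUBLISHED / DAYS_IN_JULY
novel_share_per_author = NOVELS_PUBLISHED / AUTHORS
months_per_novel = 1 / novel_share_per_author

print(f"Implied words:       {implied_total_words:,}")
print(f"Novels per day:      {novels_per_day:.1f}")
print(f"Novel share, author: {novel_share_per_author:.1%}")
print(f"Months per novel:    {months_per_novel:.1f}")
```

Rounded, that gives the 14 novels a day, 6% of a novel per author, and 16 months per novel mentioned above.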
But not everyone writes fiction, or wants to write a novel.
There is a HUGE number of images on HIVE, 795,709 - and we'll go into these numbers later.
Payment figures are also presented in the graphic above, and in the month, around $223k was paid out. The average payout was $2.59.
But as I said at the start of this post, I want to focus on the features that I engineered into the data, which is to be able to look at it by word count:
I will try to let the data speak for itself here:


Word Count and Image Insights
Below is a markdown table. It is a bit wide, so you might need to scroll sideways. It breaks down posts by their word count, then by a number of other metrics.
Word Count | Posts | % of Content | Total Pay | Max Pay | Avg Pay | Median Pay | Authors | Avg Words Per Picture | Images Found | Avg Images |
---|---|---|---|---|---|---|---|---|---|---|
< 50 Words | 14212 | 16.51% | $64,995 | $196 | $4.57 | $0 | 1771 | 5.46 | 38208 | 2.69 |
< 100 Words | 9726 | 11.30% | $8,523 | $118 | $0.88 | $0 | 1601 | 26.97 | 41300 | 4.25 |
< 250 Words | 17301 | 20.10% | $18,704 | $59 | $1.08 | $0 | 2698 | 54.75 | 104857 | 6.06 |
< 500 Words | 17998 | 20.91% | $36,064 | $82 | $2.00 | $1 | 3250 | 98.63 | 169239 | 9.40 |
< 750 Words | 12491 | 14.51% | $40,094 | $88 | $3.21 | $2 | 2587 | 142.58 | 156412 | 12.52 |
< 1000 Words | 5774 | 6.71% | $21,585 | $64 | $3.74 | $2 | 1741 | 156.00 | 96574 | 16.73 |
< 1500 Words | 5394 | 6.27% | $22,207 | $122 | $4.12 | $3 | 1339 | 165.63 | 115381 | 21.39 |
< 2000 Words | 1574 | 1.83% | $6,203 | $86 | $3.94 | $3 | 560 | 222.29 | 38038 | 24.17 |
< 2500 Words | 693 | 0.81% | $2,766 | $56 | $3.99 | $2 | 251 | 328.83 | 20750 | 29.94 |
> 2501 Words | 904 | 1.05% | $2,084 | $84 | $2.31 | $0 | 168 | 1550.87 | 14854 | 16.43 |
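As a sanity check, the Posts and Total Pay columns can be added back up to see whether they agree with the headline figures. The sketch below is in Python and uses only numbers copied from the table; the names are my own, and because Total Pay is rounded to whole dollars, the per-bucket averages it prints are approximations.

```python
# (bucket label, posts, total pay in whole dollars), copied from the table
ROWS = [
    ("< 50 Words", 14212, 64995),
    ("< 100 Words", 9726, 8523),
    ("< 250 Words", 17301, 18704),
    ("< 500 Words", 17998, 36064),
    ("< 750 Words", 12491, 40094),
    ("< 1000 Words", 5774, 21585),
    ("< 1500 Words", 5394, 22207),
    ("< 2000 Words", 1574, 6203),
    ("< 2500 Words", 693, 2766),
    ("> 2501 Words", 904, 2084),
]

total_posts = sum(posts for _, posts, _ in ROWS)
total_pay = sum(pay for _, _, pay in ROWS)
overall_avg_pay = total_pay / total_posts

print(f"Posts: {total_posts:,}  Pay: ${total_pay:,}  Avg: ${overall_avg_pay:.2f}")

# Recompute each bucket's share of content and average pay from the raw columns.
for label, posts, pay in ROWS:
    share = 100 * posts / total_posts
    avg_pay = pay / posts
    print(f"{label:<13} {share:5.2f}%  ${avg_pay:.2f}")
```

The totals come to 86,067 posts and $223,225, for an average of $2.59 per post, which lines up with the roughly $223k and $2.59 quoted earlier.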
Top Authors by Various Metrics
Again, here, I am letting the data speak for itself. If you want to know a particular author's stats, let me know, and if I get time, you might get a reply - as I will not be able to upload the whole data set here, it is enormous and I don't want to crash your browser. :)
Most Posts
Most Replies
Most Pay
Highest Max Pay
Highest Average Pay
Most Words Published
Most Swearing
Highest Average Word Count
Most Images Posted
Highest Average Images
Highest Pay Per Word
What I have Learned
When I started looking into this data, my hypothesis was that longer posts should get more rewards. What I have seen by looking at this data over the last six weeks or so, is that this is not the case.
As is the case with creative content such as blogs, travel logs, photography, art, music, fiction, philosophy, science, homesteading, code snippets, or the vast other types of content that people post on the platform, there is no single indicator of quality that can be programmatically determined.
We would need to compare like with like, within communities, and remove so many different variables.
There are people who post elaborate signatures, full of referral links, or full of images. There are others who use few words, but use images powerfully. There are others who post their video content, their gaming streams.
Some plump up their word count by providing content in a bilingual format - when we all have browsers that can translate with a single click. Some repeat pictures in these posts, or use layouts with columns.
There is a vast variety of content. We can't compare content by any single metric. We have to look at it with our own eyes, and vote for it with our own standards and sensibilities. That is what makes this platform so good - we can search for what we like, and we can appreciate what we like.
Comments on Purpose
Everyone has a different purpose for hive. For me, I think it will be to step away from the data for a little bit. I wanted to sharpen my skills using PowerBI and Power Query, as this is what I have been doing in a professional job previously, and I don't want to forget how to use the software.
I will be using my time to focus on writing my short story anthology, rambling posts, and a bunch of other things as my interests flit from thing to thing. And well, I want to play some games, too, because I miss gaming. I want to read books, and not continue to go down so many rabbit holes in this data set, now that I have learned some lessons along the way.
I hope that other people found the data useful - but coming back to purpose, please, tell me - What is your purpose for HIVE - are you writing a novel? Are you posting photographs? Are you posting your Art?
What are you passionate about? Because that is one thing that the data cannot tell me.
Before I reached that bottom part of the post I was going to say it'd be interesting to see a similar post but on people that earn the least. Adding the factor of something like rep and whether they've been recently downvoted to filter out the bad actors.
There's a lot of reports on the top earners and it rarely changes, but it'd be interesting to see a list of people Hive ignores and try to support them a bit more too. Especially when big curators basically just stick to their friends and the same groups of people.
Yeah, I second this idea!!
Since I have the hunch this different analysis would provide at least a change of perspective about how things and people's behavior should work in Hive.
I think it wouldn't change much, because people need to read posts for it to make any difference.
Not a jab at you, you've clearly read what I've written, but for so many, I wonder if they even open up the browser beyond their own posts and their own comments sections.
I feel so attacked XD
not really, I'm just disorganised and struggle with where to put time
I love the idea of time as an object. What a beautiful sequence of four words.
I know that's not what you meant, but us creatives, we have our delusions
Kind of is what I meant, I see it as some combination of an abstraction like digital money and something that flows similar to water or wind or current ^_^;
Same wavelength, different ocean :)
That list would be very long, with the median post being $0 across so many of these categories.
Rep and downvotes would be an enormous dimension to add to the data set, and we can generally see that with our own eyes.
A "reverse new" feed would be awesome to see, to see a feed of posts that "are about to pay out" - like ebay's "ending soonest" on auctions, so you could get a real impression of the things that are about to languish, unrewarded, or something that is about to get a potentially enormous, "undeserved" reward.
Interesting data, thanks to this post of yours I found out that I am among the highest paid authors but I am not among the highest paid per word, I am among those who write the most words per post and I am in the top three positions among authors who use the most images in posts.
Only for a month! Does having access to the data help you re-assess your writing or style?
I would encourage that people do what they do for themselves - as can be seen, the data doesn't really teach us much other than the existing patterns.
Do you use a lengthy signature in your post with lots of images? That is a contributing factor to image count, as well. I tried to devise a methodology to try and remove images that are used multiple times - but it was far too computationally intensive, so I abandoned that idea.
It does seem that people value longer posts here whilst that is rarer elsewhere. I think we need a mix to satisfy a wide audience. My own posts are a mix. I don't do a lot of photos, but they will usually be my own and I don't tend to use 'AI' art. For things like #FollowFriday I try to aim for around 500 words. I don't expect people to spend more than a few minutes on a post by me, but if someone does a deep dive on something I'm interested in then I will take my time to read it.
I yearn to find those deep dives. They're rare, so I try to write them all myself :D ... but I am not a typical user of HIVE, I don't think.
Regarding the AI art - I like to try and use my own images, but for me, the words come first, and then finding a picture to try and illustrate that point (be it from my own library - or the stock images available, or the myriad, unknowable depth of "AI") - I will tend to use a combination of whatever my mind deems suitable at a time.
The important thing is reading the words that others say, and reading between the lines, where their writing has enabled such sophistication (or perhaps naivety)
LoL. Don't you find this interesting? The current level of ignorance, stupidity and greed in the Hive blockchain by most of its herd members following blindly the stupider rules of some communities demanding "bilingual posts" in order to be "curated" and worthy of an upvote?
I think that this is an education problem. English is probably better for SEO, but the world isn't painted in the same brush. We lose everything in the translation, but I guess, with people who don't know that they can click a single button - and have a translation available... I'm not sure what we can do about that, as I said in reply to @harbiter -
I can only speak three languages - English, whimpering, and whatever those sounds are that I made as an infant.
The bilingual thing makes me think... It's true what you say, that we can easily translate a post into our native language. BUT when I personally write a bilingual post, I see that the translator often misses some specific expressions and I must find an alternative way to write them in English. That's why I always check my translations. If I didn't do that, trust me, you'd see some sentences here and there that make totally no sense 😅
I very much respect true polyglots - I can only speak two languages - English, and whimpering. Three, if you count whatever language we all had in common as infants. However, the mere fact that you understand that there are differences and sometimes no way to truly express the same thing in the same way in different languages is fundamental to your understanding of what it is to actually communicate.
Not to just brute force the translation, but to treat it with nuance.
Good to check, but hey, I do tend to use that click translate button from time to time, and it is enough for the general purpose of a post to come across. Where I become conflicted, and feel a sense of absolute entitlement, is where I want to respond - should I do it in English, or should I write it in English, then translate it to the original language, then hope my message isn't lost - or that people think I have a mastery of other languages.
Yes I understand what you're saying. I keep writing in English because it's all practice for me :) But when I'm writing very long posts, it's much easier and quicker to write in Italian and then translate and check. I should do the "quote" thing to put translated parts when needed, instead of repeating the whole post though.
All my bugbears in one paragraph.
Okay not all of them.
Every single bugbear in a single paragraph would be an impressive feat of economy with words.
I laughed in complete unsurprise that Galen got most swearing XD