Jargon Footprint

The concept of 'carbon footprint' is a useful way of drawing attention to profligate energy wastage (or, conversely, of boasting of one's conspicuous consumption).

However, words also have a cost. They cost energy and resources to print on paper, to store on servers or to transmit electronically. The more people access those words, the more their energy costs are multiplied.

It should be possible, therefore, to account for these costs in the same way that we can tally up the carbon cost of guacamole or of a skiing holiday.

Take, as an example, the post you are reading now. We might estimate that N people will read it over its lifetime, meaning that it will be bounced around between N*S computers, consuming an amount E of electricity in transmission and display. It will also occupy X% of the server where it is being stored, hence costing X% of the total energy costs of that server. It will also consume so many seconds of so many people's time to read, and (being Z% of the total HB information) can be said to consume a certain very small percentage of Jutta's time in maintaining the HB. And so on and so fifth.

Thus, we can calculate the energy cost of this post.
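
A minimal sketch of that tally in Python, assuming made-up figures throughout: the function name and every number below are placeholders rather than measurements, and only the E and X% terms are covered (readers' time and Jutta's are left out).

```python
def post_energy_kwh(transmission_display_kwh, server_share_percent, server_total_kwh):
    """Energy attributable to one post: E plus X% of the server's energy."""
    server_part = (server_share_percent / 100.0) * server_total_kwh
    return transmission_display_kwh + server_part


# Hypothetical inputs: E = 0.5 kWh over the post's lifetime, and the post
# occupies X = 0.001% of a server that uses 2000 kWh over the same period.
print(post_energy_kwh(0.5, 0.001, 2000.0))
```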

Now, had I been more brevious, I could have told you all this in fewer words - perhaps only one third as many. I have therefore wasted a considerable amount of energy through my verbitude. This wastage can be captured as a "jargon footprint" which (if I could have said the same thing in one third as many words) would in this case be 3.0.

The 'jargon footprint' can, like its carbon cousin, be widely applied. For example, a government office might, by issuing overly verbose documents to large numbers of people, score a jargon footprint of 10 or even 20. A sign which I recently saw on a public convenience (which read "We regret that these toilets are temporarily unavailable for cleansing - please use alternative facilities" - 14 words) would score about 4.7, since the relevant message could have been put in three words as "Closed for Cleaning".
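
The word-count version of the footprint can be sketched in a few lines of Python. This assumes a "word" is any run of letters, digits or apostrophes, and that somebody has already supplied the shortest adequate wording; the function names are illustrative only.

```python
import re


def count_words(text):
    """Count runs of letters, digits or apostrophes; lone dashes are ignored."""
    return len(re.findall(r"[\w']+", text))


def jargon_footprint(actual, minimal):
    """Words actually used divided by words needed (1.0 means no waste)."""
    return count_words(actual) / count_words(minimal)


sign = ("We regret that these toilets are temporarily unavailable "
        "for cleansing - please use alternative facilities")

# 14 words on the sign against 3 in the short version.
print(round(jargon_footprint(sign, "Closed for Cleaning"), 1))
```

The hard part, of course, is not the division but agreeing on the minimal wording.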

Individuals can, on the basis of their lifetime output of words versus information, be assigned a jargon footprint. So can entire disciplines. Sociology, for example, probably has a jargon footprint of over 100, since it uses many words to convey almost no information; physics probably has a far smaller footprint.

Even languages can be assigned footprints. Comparison of multi-lingual signs invariably shows that some languages are intrinsically more concise than others. No doubt the French are particularly guilty in this respect.

Nations, organizations and individuals should all strive to reduce their wordage, driven by this common metric.

Blimey, I do go on, don't I?

MaxwellBuchanan, Dec 05 2012


       What he said.
normzone, Dec 05 2012

       As soon as you used "brevious", you made the sig at the bottom redundant.
lurch, Dec 05 2012

       Ah, but in Chinese, more bits per character are needed than in a Western language with a proper writing system. Also, compression is less easy.
MaxwellBuchanan, Dec 05 2012

       You didn't seem to factor in the additional footprint of annotations, and any subsequent ideas this one may inspire that otherwise would not have existed.   

       Think about the jargon footprint of Socrates, who, being concerned about his own, didn't write a thing. He should have been more careful about his carbon dioxide footprint, and kept his thoughts to himself.
rcarty, Dec 06 2012

       This is closely related to information theory; it seems to me that the Jargon Footprint is usually called redundancy.
piluso, Dec 06 2012

       Bah humbug. Redundancy, like anything else, is relative. Take DNA - If we wanted to compress the amount of information present in a single strand of DNA, I'm sure there are plenty of really good algorithms out there that would help sort things out. Then there's the encapsulation problem. Why does the body need to have a billion-billion copies of the same piece of code? Why not reorganise the whole thing on a centralised model where a single, secure repository is responsible for holding a single master copy of the information and have each cell reference that master directly? It's like having each word in the Complete Works of Shakespeare being printed using a really tiny list of all the words (in sequence) present in the Complete Works of Shakespeare - completely unnecessary and having an informational footprint of many bazillions.
zen_tom, Dec 06 2012

       [+], but   

       //some languages are intrinsically more concise than others//   

       How does German score here? They tend to take about 5 words and bung them together as one, so although the length of any piece of writing might be the same as in another language, the Germans should generally have fewer words.
TomP, Dec 06 2012

       //How does German score here? They tend to take about 5 words and bung them together// Strictly, we ought to count characters, in which case this historically-rooted attempt by the Hun to pre-emptively get around my jargon footprint penalty scheme is thwarted.
MaxwellBuchanan, Dec 06 2012

       The thing is, [bigs], that Chinese isn't really proper writing, is it? They just made do by drawing stuff and, over the years, the drawings degenerated like a bank manager's signature. At best, written Chinese conveys the general gist of whatever it was you actually wanted to say, as long as you interpret it with hindsight.   

       This is why we in the West, who took the time and trouble to work out the whole writing thing, can say pretty much anything we want using just 26 characters (or 25 in Norfolk, where they don't use "o"), whereas the poor benighted Chinese have had to come up with several thousand different cartoons. Written Chinese is like a cross between Pictionary and charades ("bird that makes a noise like a carrot" - "parrot").   

       They've clearly put a lot of effort into the descriptions of monkeys juxtaposed with chests of drawers, but how often does this situation arise?   

       To make the situation fairer, the jargon footprint of every language can be calculated as the character count of text (relative to a page of English) multiplied by the number of characters in the "alphabet", divided by 26. Thus, English would have a score of 1*(26/26)=1. Chinese might have a score of 0.7*(1118/26)=30. Binary, as another example, might have a score of 5*(2/26)=0.38.
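
       For anyone who wants to check the sums, here is a small Python sketch of that formula. The function name is invented for the purpose, and the relative lengths and symbol counts are the guesses quoted above, not measured values.

```python
def language_score(relative_length, alphabet_size, reference_alphabet=26):
    """Relative text length times alphabet size, normalised to 26 letters."""
    return relative_length * alphabet_size / reference_alphabet


# (name, length relative to English, number of symbols): all guesses.
languages = [
    ("English", 1.0, 26),
    ("Chinese", 0.7, 1118),
    ("Binary", 5.0, 2),
]

for name, rel_len, symbols in languages:
    print(f"{name}: {language_score(rel_len, symbols):.2f}")
```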
MaxwellBuchanan, Dec 06 2012

       What about Braille?
xandram, Dec 06 2012

       Regarding compression algorithms, I suspect that all languages can be compressed by roughly the same proportion, given that some characters/symbols will be more abundant than others.   

       Regarding //Given that we can remove a fair degree of complexity from chinese characters discounting the monkey and chest of drawers glyphs I'd say its still a winner.// I disagree. How many Chinese cartoons do you need to be familiar with to read a newspaper*? Surely many more than 52. Thus, if Chinese uses only half as many characters as the English equivalent, it has a larger jargon footprint.   

       How about an experiment? I took a page of prose in English, and did automatic translation into Simplified Chinese. I then tried compressing (by ZIP) both files. The results (in bytes, before/after compression) are:
English: 2582/1557
Chinese: 5676/2020

       So, Chinese gives a bigger file, whether compressed or not. All depends on encoding, compression algorithm and text length but, broadly speaking, Chinese cartoons are lengthier than Proper English.
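
       A rough way to repeat the experiment in Python, assuming UTF-8 encoding and zlib's DEFLATE (the same family of compression that ZIP normally uses). The two strings are short stand-ins, not the page I actually used; paste in a full page of prose and its translation for a fairer comparison, since very short inputs can come out larger after compression.

```python
import zlib


def sizes(text, encoding="utf-8"):
    """Return (raw bytes, compressed bytes) for a piece of text."""
    raw = text.encode(encoding)
    return len(raw), len(zlib.compress(raw, 9))


# Hypothetical sample texts; replace with real, matched translations.
samples = {
    "English": "Closed for cleaning. Please use the other toilets.",
    "Chinese": "清洁中，暂停使用。请使用其他洗手间。",
}

for name, text in samples.items():
    before, after = sizes(text)
    print(f"{name}: {before}/{after}")
```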
MaxwellBuchanan, Dec 06 2012

       Always been impressed with the economy of the ancient people of northern Britain. Very unwindy, they became well known in our time for their signage outside public restrooms.
cudgel, Dec 06 2012

       I can't cite a source for this but I think languages tend to have distinctive compression ratios. With Mandarin Chinese, there are a fixed number of possible spoken syllables which they are very attached to thinking of as words. Just as German chooses to capitalise certain words to distinguish meaning which it can get away with not shouting when we speak it, e.g. "Sie" versus "sie", so Chinese chooses to use a large number of ideograms for the same spoken words. Compress spoken Chinese and each "word"/syllable takes up less space. If Mandarin has five hundred possible syllables ignoring tone and four possible tones, that's eleven bits per morpheme and perhaps word. English has forty-four phonemes and the mean length of a word is around five phonemes, so the "average" uncompressed spoken English word considered as a string of speech sounds rather than sounds of a more general kind (remembering where we are and who started this place) is roughly twenty-seven bits "long". However, those words frequently consist of several morphemes, and Indoeuropean languages tend to encode more "units of meaning" per morpheme than non-IE ones, which are often either agglutinative or have isolated morphemes each considered to be words. Therefore, we Anglophones have compression built into our speech, unlike Mandarin, but interestingly, our writing system tends towards a system of ideograms masquerading as an alphabetic script, so the compression ratios of our speech and writing, like Mandarin's, are probably quite different. Also, exactly how small is a single atomic idea?
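
       The bit counts can be checked with a few lines of Python. This is only a sketch: it assumes every syllable, tone and phoneme is equally likely (which is certainly false for real speech) and uses the rough inventory sizes given above.

```python
import math


def bits_per_choice(options):
    """Bits needed to pick one of `options` equally likely alternatives."""
    return math.log2(options)


# Mandarin: about 500 toneless syllables times 4 tones per morpheme.
mandarin_morpheme_bits = bits_per_choice(500 * 4)

# English: about 5 phonemes per word, each drawn from 44 phonemes.
english_word_bits = 5 * bits_per_choice(44)

print(f"Mandarin morpheme: {mandarin_morpheme_bits:.1f} bits")
print(f"English word: {english_word_bits:.1f} bits")
```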
nineteenthly, Dec 07 2012

       //"encircle Wèi to save Zhào"... this 4-character summary is sufficient to make the point.//   

       Well, that's true enough except that it isn't, of course. You can tell me to encircle Wei to save Zhao all you like and, frankly, all you'll get from me is a blank stare because the statement presupposes a lot of specific knowledge.   

       On the other hand "Buy low, sell high" _is_ a self-explanatory strategy.
MaxwellBuchanan, Dec 07 2012

       Ah, but that was _my_ point in the original post. You measure the length of a given piece of text, and compare it to the minimum length needed to convey the same information to a similar audience.
MaxwellBuchanan, Dec 07 2012

       Thinking in terms of mechanical efficiency, it's cutting out noise, friction, reducing mass, shaping something to fit into the right hole, stuff like that. "Buy low, sell high" doesn't really need knowledge of advanced markets to make sense. Buy something when the value is low and sell it when the value is high. Extrapolating on the meanings of everything may eventually produce quite a lot of information. Value is certainly a complex concept, and by cutting out that word and implying its meaning in 'low' and 'high', some market sense is built in. An agrarian person might think buy seeds and soil, physically low things, and sell fruit when it is high, physically high on top of a tree or plant. But then of course those physical things have a value that is not literal. It should be confusing because even in terms of social stratification something nonliteral like stratification is interpreted as a literal hierarchy, but really the evaluation is much more complex and may only be coincidentally physical, like a homeless person sleeping low under a bridge, and a wealthy person high in a penthouse. More to the point about mechanical efficiency, the noise isn't necessarily implicit in the language but a noise of referents that a person might use to make sense of something. If someone says "look at that car" that is a pretty simple message, but if there are many cars such as on a highway the noise comes from outside, and the message was just an exclamation and not a suggestion, but what if there is only one car and you're in a small garage? Then the clear message is really a senseless noise. So that is also somewhat of a point that perhaps some noise was required to reach, that jargon is perhaps not the only source of noise but confusing points of reference as well, literal and nonliteral ones.
rcarty, Dec 07 2012

       //how do you measure the efficiency of a speaker/writer if efficiency almost entirely depends on the cognition of the listener/reader? You can't.//   

       You can take a decent stab at it.   

       For example, the phrase "La plume de ma tante" contains the same information as "My aunt's pen", and relies on the same background knowledge in the reader. But the French phrase has 20 characters (including spaces) whereas the English phrase has only 13 (including spaces and the apostrophe). Hence, by this limited measure, French has a footprint of 20/13, or about 1.5.
MaxwellBuchanan, Dec 07 2012

       There's also "LO!", although that's sort of tonal.
nineteenthly, Dec 07 2012


       Another way to approach this would be to think in terms of how much something can be exploded and how much it can be imploded. "Buy low; sell high" when taken as meaningful words, as opposed to destroying it in a mass of meaningless referents as above, can be exploded to the definitions of each word. If someone took care and time to do this exploding of each word, for example defining buy, defining the words that define buy, defining the words that give meaning to those words etc. then quite a long treatise could be produced on rational exchange and value, and things of that nature, probably not without some difficulty using various dictionaries. That treatise could be read and someone could implode it down to "buy low; sell high" and this is something that could be tested if someone were to perform such an undertaking, or undertake such a performance.   

       Think about something known to be absurd that cannot be exploded: "you only live once". Only and once are the same word, and you and live refer to the same life. Basically it can be imploded down to "once" and the explosion that takes place is in the absurd mass of "one" justifying "all". But of course it is an absurdist philosophy, where no explosion of meaning can take place, and adherents simply bask in the meaninglessness. They can't verify the claim, only the past certainty that authority was without god. Camus pour le chameau. A more meaningful person without so many humps would probably say something like "live each day to the fullest", fullest providing for a fullness of meaning, and each day one of a multiplicity each with a more meaningful fullness than the last, or "seize the day" with a purposefulness that transcends meaning.
rcarty, Dec 07 2012

nineteenthly, Dec 08 2012

       You'll have to consider the acceleration of the JF in time, as more and more lengthy annotations are written. On the other hand, there is a threshold past which most people don't read. And with annotations, sometimes if they are too long they'll be skipped altogether.
pashute, Dec 08 2012


MaxwellBuchanan, Dec 08 2012


