That's not why array indices start at zero. That is an astounding display of arrogance and ignorance.
Ok, I’ll bite. I would think the direct word of Edsger Dijkstra should carry more weight than some arbitrary decision of an obscure language designer and his unknown fanboy blogger. It didn’t. He keeps poor company. Why we started doesn’t matter; only why we continued. The “post-facto” claim is unprovable and irrefutable. He could as easily claim the primality of 1 is a “historical artifact” and that 1 ought to be composite. Unfortunately for him, the existence of linear time does not make mathematical axioms “historical artifacts” rather than products of reason. There are reasons 1 is not prime, and there are reasons numbering and indexing should begin with 0.

I find it hard to believe anyone can write any amount of pointer arithmetic in both one- and zero-based indexing languages and not quickly recognize zero-based indexing as superior. I’m speaking from experience. It seems more likely he has seldom written any kind of pointer arithmetic. Wikipedia does a good job of describing why zero-based indexing is mathematically simpler. What is right is often not intuitive. Case in point: the Monty Hall Problem.

His “we” doesn’t describe any programmers I know. It kinda sounds like he’s surrounded himself with imbeciles and decided himself wise.
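To make the pointer-arithmetic point concrete, here is a minimal C sketch (my own illustration, nothing from the article): zero-based indexing makes a[i] identical to *(a + i), so the subscript is exactly the address offset, while a hypothetical one-based scheme pays an extra subtraction on every subscript.

    #include <stdio.h>

    int main(void) {
        int a[5] = {10, 20, 30, 40, 50};

        /* Zero-based: a[i] is defined as *(a + i); the address of
         * element i is base + i * sizeof(int), no correction term. */
        for (int i = 0; i < 5; i++)
            printf("a[%d] = %d at %p\n", i, *(a + i), (void *)(a + i));

        /* A one-based scheme would need base + (i - 1) on every access. */
        for (int i = 1; i <= 5; i++)
            printf("element %d = %d\n", i, *(a + (i - 1)));

        return 0;
    }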
“Too dumb to rise to the level of wrong?” Edsger Dijkstra? Really? This is only really true of Java (to use a wide, ugly brush). No one with any background in programming language theory, or who’s ever used LISP, thinks these things are new. There are new developments in programming language theory. See: Haskell, Rust.

Let me address a concrete example. C doesn’t have foreach because it requires keeping track of a container’s size. You have to know where to stop. C arrays don’t have a length, unless they’re declared on the stack, and even then only if you have the original declaration (i.e. you haven’t passed the array to a function). C is a performance-oriented language, and paying to carry that length around is unacceptable. Additionally, what if I say "int b[5]; int* a = b + 1;"? What’s the size of a? Are you going to do another add to track the length? Or perhaps you’d rather dynamically subtract the length in the foreach, which means you need a pointer to the original. Pointer arithmetic just got way more expensive. Good luck deallocating those too. (There’s a sketch of this decay problem just below.) By the way, foreach macros for static arrays are trivial; see the sketch further down.

Backus’ article isn’t talking about alternative architectures, it’s talking about higher-level software abstractions. But I’ll give him the benefit of the doubt, and assume it was a typo. That article was written in 1978. LISP was invented in 1958. Lambda calculus was invented in 1936. Those higher-level abstractions have always been around (and Backus knew it). Programmers like him simply aren’t willing to ascend the learning curve. In fact, LISP is the source of most of the “innovations” that programmers like him, who have only ever written ALGOL derivatives, are surprised to find “in 45-year-old demo code.”

I think it’s clear he has little understanding of basic mathematics, computer science theory, language theory, or the recent progress made in language theory. He abounds in irrefutable opinions (most of which are categorically wrong, in my anecdotal experience), but nearly every concrete fact he states is wrong. In this particular instance, he found a bit of historical trivia, decided existing practice is entirely ignorant, and decried all other programmers as superstitious. No, it’s really just him. He should stop learning history, and start learning Haskell, Common Lisp, Rust, Fortran, and C. Then have a discussion about language theory.

I would think the direct word of the designer of the first-ever 0-indexed programming language, from which all other 0-indexed languages descend, should carry more weight than some post-hoc justification from the ’80s.
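To make the array-length point concrete, a minimal sketch (mine, not the commenter’s): sizeof can recover a length only while the original array declaration is visible; once the array decays to a pointer, whether by being passed to a function or offset as in int* a = b + 1, the length is simply gone. The helper name below is hypothetical.

    #include <stdio.h>

    /* An array argument decays to a plain pointer, so in the callee
     * sizeof(arr) is just the size of the pointer itself. */
    static void length_is_gone(int *arr) {
        printf("sizeof in callee: %zu (pointer only)\n", sizeof(arr));
    }

    int main(void) {
        int b[5] = {0};
        int *a = b + 1; /* the comment's example: what is the size of a? */

        /* The declaration is in scope here, so this yields 5. */
        printf("elements in b: %zu\n", sizeof(b) / sizeof(*b));

        /* a is a bare pointer; nothing records the 4 elements left. */
        printf("sizeof a: %zu (pointer only)\n", sizeof(a));

        length_is_gone(b);
        return 0;
    }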
if your answer started with “because in C…”
the technical reason we started counting arrays at zero
The usual arguments involving pointer arithmetic and incrementing by sizeof(struct) and so forth describe features that are nice enough once you’ve got the hang of them, but they’re also post-facto justifications.
to a typical human thinking about the zeroth element of an array doesn’t make any more sense than trying to catch the zeroth bus
there are dozens of other arguments for zero-indexing involving “natural numbers” or “elegance” or some other unresearched hippie voodoo nonsense that are either wrong or too dumb to rise to the level of wrong.
approaches we call part of “modern” programming if we attempt them at all, sitting abandoned in 45-year-old demo code
How many off-by-one disasters could we have avoided if the “foreach” construct that existed in BCPL had made it into C?
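That foreach-in-BCPL point connects to the comment above: a foreach macro over static C arrays really is trivial, precisely because sizeof still sees the array type at the point of use. A minimal sketch, assuming nothing beyond standard C (the FOREACH name is mine):

    #include <stdio.h>

    /* Walk a pointer over a true array. sizeof(arr)/sizeof(*(arr)) is
     * only meaningful while the array type is visible, which is why
     * this can't work across a function boundary. */
    #define FOREACH(item, arr) \
        for (item = (arr); item < (arr) + sizeof(arr) / sizeof(*(arr)); ++item)

    int main(void) {
        int nums[] = {3, 1, 4, 1, 5};
        int *p;
        FOREACH(p, nums)
            printf("%d\n", *p);
        return 0;
    }

Pass it a decayed pointer instead of an array and sizeof silently gives the wrong count, which is exactly the boundary the earlier comment points at.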
wondering if we can get away from Von Neumann architecture
Whatever programmers think about themselves and these towering logic-engines we’ve erected, we’re a lot more superstitious than we realize. We tell and retell this collection of unsourced, inaccurate stories about the nature of the world without ever doing the research ourselves, and there’s no other word for that but “mythology”. Worse, by obscuring the technical and social conditions that led humans to make these technical and social decisions, by talking about the nature of computing as we find it today as though it’s an inevitable consequence of an immutable set of physical laws, we’re effectively denying any responsibility for how we got here. And worse than that, by refusing to dig into our history and understand the social and technical motivations for those choices, by steadfastly refusing to investigate the difference between a motive and a justification, we’re disavowing any agency we might have over the shape of the future. We just keep mouthing platitudes and pretending the way things are is nobody’s fault, and the more history you learn and the more you look at the sad state of modern computing, the more pathetic and irresponsible that sounds.
This is only tangentially related, but musical set theory uses the numbers 0-11 to describe the 12 individual semitones of standard Western harmony. I've always run under the assumption that they use that system because it makes the math easier when you start to deal with symmetrical intervals (unison and an octave being treated the same, 4ths and 5ths being treated the same). Because the symmetrical intervals are treated the same, but you are often looking for the smallest possible interval, you can find the smaller interval by subtracting the interval you have from 9 in traditional harmony, or from 12 in set theory. In traditional music theory, this means P4=P5, Maj3=min6, min3=Maj6. In set theory it means that everything mirrors within 0-6, so 0=12, 1=11, 2=10, etc., up to 6=6, the mirroring at the tritone. If you tried to start at 1 instead of 0, that mirroring doesn't work out: 1=13, 2=12, 3=11, 4=10, 5=9, 6=8, 7=7? Oops. Just more proof that starting with 0 is easier, I guess.
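The mirroring is just negation modulo 12, and it only comes out this clean because the numbering starts at 0. A quick sketch in C (my illustration, to match the other examples here) that prints the inversion table:

    #include <stdio.h>

    int main(void) {
        /* Pitch classes 0-11: inversion is negation mod 12, so
         * 0 -> 0, 1 -> 11, 2 -> 10, ..., 6 -> 6 (the tritone). */
        for (int pc = 0; pc < 12; pc++)
            printf("%2d inverts to %2d\n", pc, (12 - pc) % 12);

        /* A 1-based numbering has to shift in and out instead:
         * ((12 - (pc - 1)) % 12) + 1 -- the symmetry disappears
         * from the numbers themselves. */
        return 0;
    }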
If you read the comments below the article, he addresses somebody who said the exact same thing: “Too dumb to rise to the level of wrong?” Edsger Dijkstra? Really?
Dijkstra’s a very smart, eminently respectable fellow, and I have no quarrel with him. People citing an only-tangentially-related paper written by Dijkstra twenty years after the fact is where the “too dumb to rise to the level of wrong” part comes into play.
Great analysis. From a math standpoint, it is almost always easiest to start counting at zero. Formulas become a lot easier, and in my programming experience code is almost always (not quite always) easier when you start the count at zero. It changes depending on how you want to think about a task. Seems the author is looking for a conspiracy where there isn't one. Meh.
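One everyday case of formulas getting easier, sketched in C (my example): row-major indexing into a flattened 2-D array is plain row * COLS + col when everything counts from zero, while a one-based convention needs correction terms on the way in and out.

    #include <stdio.h>

    #define ROWS 3
    #define COLS 4

    int main(void) {
        int grid[ROWS * COLS];

        /* Zero-based: the flattened index is just row * COLS + col.
         * One-based it would be (row - 1) * COLS + (col - 1) + 1. */
        for (int row = 0; row < ROWS; row++)
            for (int col = 0; col < COLS; col++)
                grid[row * COLS + col] = row * 10 + col;

        printf("grid[1][2] = %d\n", grid[1 * COLS + 2]); /* prints 12 */
        return 0;
    }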
Incidentally, this is one of the reasons Fortran can potentially be faster than C. C requires the compiler to assume that pointers may alias, which limits the optimisations it can do. Fortran makes no such allowance, so its compilers are free to assume arguments don't overlap. The C99 restrict keyword addresses this. I was shocked the first time I saw a simple matrix Fortran program outperform an identical C program.
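A minimal sketch of that aliasing point (my example, not the commenter's benchmark): without restrict, the compiler has to assume out and in might overlap, so a store through out can invalidate anything previously loaded through in or scale; restrict is the C99 promise that they don't overlap, which is roughly the guarantee Fortran's argument rules give its compilers for free.

    #include <stddef.h>

    /* The compiler must assume out, in, and scale can alias, so it
     * re-reads *scale after every store through out. */
    void scale_maybe_alias(double *out, const double *in,
                           const double *scale, size_t n) {
        for (size_t i = 0; i < n; i++)
            out[i] = in[i] * *scale;
    }

    /* restrict promises no overlap, so *scale can be hoisted out of
     * the loop and the loop vectorized more aggressively. */
    void scale_no_alias(double *restrict out, const double *restrict in,
                        const double *restrict scale, size_t n) {
        for (size_t i = 0; i < n; i++)
            out[i] = in[i] * *scale;
    }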
Frankly, I was entirely unconvinced by his interpretation of Martin Richards' comment. Richards plainly says
[...] Just as machine code allows address arithmetic so does BCPL, so if p is a pointer p+1 is a pointer to the next word after the one p points to. Naturally p+0 has the same value as p. [...] I can see no sensible reason why the first element of a BCPL array should have subscript one.

He's talking about pointer arithmetic! Which is exactly what the author of the article is saying was NOT the reason to choose 0-indexing...? How does that even make sense?

BCPL was essentially designed as a typeless language close to machine code.
Is there anyone in the industry who would both read research papers and isn't a member of the ACM or IEEE? A couple of hundred dollars a year is unreasonable if you're a hobbyist or a student (though in the latter case you have your university library), but programming pays well enough that it's not that expensive if you're a working programmer. Open access is great, but I don't buy that membership dues are as much of a barrier as he seems to think they are.