Can't decide whether end-user index access should be 0- or 1-based and whether the END index should be inclusive

I'm currently writing a CLI tool that handles a specific JSON data format, and I want to let the user get a slice of the file's item array. The slice is given as --slice START:END through a command-line option, for example --slice 1:2.

Should I provide a 0-based or a 1-based index? For example, --slice 1:2 would start with the second element under 0-based indexing and with the first element under 1-based indexing.
And do you think it's better for END to be inclusive or exclusive? For example, --slice 1:2 would return only one element if END is exclusive, or two elements if it's inclusive.
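To make the four combinations concrete, here is a minimal Python sketch. The function name and the one_based / inclusive flags are made up for illustration (they are not part of any existing tool), and input validation is left out.

```python
def slice_items(items, start, end, *, one_based=False, inclusive=False):
    """Slice `items` under the chosen indexing convention.

    `one_based` and `inclusive` are hypothetical option names used only
    to illustrate the question; bounds checking is intentionally omitted.
    """
    # Translate the user's numbers to Python's 0-based, end-exclusive form.
    lo = start - 1 if one_based else start
    hi = end - 1 if one_based else end
    if inclusive:
        hi += 1  # an inclusive END covers one more element
    return items[lo:hi]


items = ["a", "b", "c", "d"]

# --slice 1:2 under each convention:
print(slice_items(items, 1, 2))                                  # ['b']
print(slice_items(items, 1, 2, inclusive=True))                  # ['b', 'c']
print(slice_items(items, 1, 2, one_based=True))                  # ['a']
print(slice_items(items, 1, 2, one_based=True, inclusive=True))  # ['a', 'b']
```

Whatever convention is picked, converting it to one internal form right at the argument-parsing boundary keeps the rest of the tool free of off-by-one adjustments.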

I know this is all personal taste, but I'm torn between the options and cannot decide, so I thought I'd ask what you think. Maybe that helps me sort out my own thoughts a bit. Thanks in advance.

View original on beehaw.org

Comments (8)

bleistift2

sopuli.xyz

Anybody capable of using a CLI knows that the right answer is:

index from 0
end is exclusive.

Dijkstra points out why: https://www.cs.utexas.edu/~EWD/transcriptions/EWD08xx/EWD831.html
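A rough illustration of the properties usually cited in favour of 0-based, end-exclusive ranges, using Python slices (which follow that convention). This is only a sketch of the argument with made-up sample values, not a quote from the linked note.

```python
items = list(range(10))

# 1. The length of a half-open range is simply end - start.
start, end = 2, 5
assert len(items[start:end]) == end - start

# 2. Adjacent ranges share one boundary value and join with no gap or overlap.
mid = 5
assert items[:mid] + items[mid:] == items

# 3. An empty range needs no special case: start == end.
assert items[3:3] == []

# 4. With 0-based indexing, the full sequence is 0:len(items).
assert items[0:len(items)] == items
```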

limer reply

lemmy.ml

I agree with the other comment that argues for matching what the users expect. I think 1-based is the logical choice here.

atomic peach

pawb.social

I think it would depend on the typical user base and how the rest of the CLI operates. If it's typical array work, or your users are mostly programmers or otherwise know computing, then stick to 0-based indexing. If they're spreadsheet users who rarely deal with zero-based indices, then stick to what they know. Just document it well enough for everyone!

I'd also think inclusive is more intuitive. If they want only one element, they can provide that single index; otherwise they get the full range.

That said, if your CLI is trying to mimic an existing programming function and that intent is very clear, then follow the behavior of the parent function.

K2yfi

programming.dev

I've been working on this problem for my own language, and have landed on something clearer than just following a convention. Basically, you use [] and () to specify whether the left and right bounds are included (based on interval notation: https://en.wikipedia.org/wiki/Interval_(mathematics)#Including_or_excluding_endpoints). E.g., for your case:

--slice [1:5)    # include the left index. don't include the right index
--slice [1:5]    # include both left and right index
--slice (1:5]    # don't include the left index. include the right index
--slice (1:5)    # don't include the left or right index
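A minimal sketch of how such a spec could be parsed in Python. The regex, the function name, and the choice of 0-based non-negative indices are all assumptions made for illustration; this is not the parser from the language mentioned above.

```python
import re

# Hypothetical grammar: an opening [ or (, two non-negative integers
# separated by a colon, and a closing ] or ).
_INTERVAL = re.compile(r"^([\[(])(\d+):(\d+)([\])])$")


def parse_interval(spec):
    """Convert an interval spec to a 0-based, end-exclusive (start, stop) pair."""
    m = _INTERVAL.match(spec)
    if m is None:
        raise ValueError(f"invalid slice spec: {spec!r}")
    left, start, end, right = m.group(1), int(m.group(2)), int(m.group(3)), m.group(4)
    if left == "(":
        start += 1  # excluded left bound: begin one element later
    if right == "]":
        end += 1    # included right bound: stop one element later
    return start, end


items = list(range(10))
for spec in ("[1:5)", "[1:5]", "(1:5]", "(1:5)"):
    start, stop = parse_interval(spec)
    print(spec, items[start:stop])
```

One practical caveat: brackets and parentheses are special characters in most shells, so users would likely have to quote the argument, e.g. --slice '[1:5)'.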

Potentially not relevant to your case, but my version supports an end keyword that you can do math on, similar to Python's negative indexing:

[2:end-3]    # start at index 2 (included) and go through till the third from last index (included)
(end-3:end]  # start at the third from last (excluded) and go to the end (included)
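Resolving such a bound could look roughly like the following Python sketch. It assumes that end stands for the last valid index (length - 1) and that only the forms N, end, and end-N are allowed; both are assumptions for illustration, and the actual language may define them differently.

```python
import re

# Hypothetical bound grammar: a plain integer, "end", or "end-N".
_BOUND = re.compile(r"^(?:(\d+)|end(?:-(\d+))?)$")


def resolve_bound(text, length):
    """Resolve a bound expression to a 0-based index.

    Assumption: 'end' means the last valid index, i.e. length - 1.
    """
    m = _BOUND.match(text)
    if m is None:
        raise ValueError(f"invalid bound: {text!r}")
    if m.group(1) is not None:
        return int(m.group(1))
    offset = int(m.group(2)) if m.group(2) else 0
    return length - 1 - offset


print(resolve_bound("2", 10))      # 2
print(resolve_bound("end", 10))    # 9
print(resolve_bound("end-3", 10))  # 6
```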

Personally I'm a fan of 0-based indexing, but for your context, I think it would depend on how the user sees what they're slicing. E.g., if it were pages with page numbers, the numbers themselves would indicate whether indexing is 0- or 1-based. If there's nothing to actually show the user, I think picking something reasonable and documenting it well is probably the best bet.

Miaou reply

jlai.lu

That's still following a convention

Life is Tetris

leminal.space

Ruby/Crystal seem to have P .. Q for inclusive ranges and P ... Q for right-exclusive ranges.
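If the CLI borrowed that two-dot / three-dot spelling, the parsing could be sketched in Python like this. The function name is made up, and the sketch assumes 0-based, non-negative integer bounds with no surrounding spaces.

```python
def parse_dotted_range(spec):
    """Convert 'P..Q' (inclusive) or 'P...Q' (right-exclusive) to (start, stop).

    The result is 0-based and end-exclusive, ready for Python slicing.
    """
    # Check for three dots first, since "..." also contains "..".
    if "..." in spec:
        start, end = spec.split("...")
        return int(start), int(end)
    start, end = spec.split("..")
    return int(start), int(end) + 1  # inclusive end covers one more element


items = list(range(10))
start, stop = parse_dotted_range("1..5")
print(items[start:stop])   # [1, 2, 3, 4, 5]
start, stop = parse_dotted_range("1...5")
print(items[start:stop])   # [1, 2, 3, 4]
```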

bleistift2

sopuli.xyz

You’re writing a CLI tool to handle JSON data. Just making sure: You know jq exists, right?

1Fuji2Taka3Nasubi

lemmy.zip

I know some programming languages use : for ranges, and it is more legible if you support negative indices, but I think START-END reads more naturally, and I'd use : for START:COUNT instead, e.g. 3:4 for 4 elements starting from 3, so elements 3,4,5,6 or 3-6.

You can even support both formats! (Feature creep warning)
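Supporting both spellings might look like this Python sketch. The function name is hypothetical, and it assumes non-negative integers only, since a leading minus sign would clash with the - separator.

```python
def parse_range(spec):
    """Return (first, last), both inclusive, for 'START-END' or 'START:COUNT'."""
    if ":" in spec:
        start, count = (int(part) for part in spec.split(":"))
        if count < 1:
            raise ValueError("COUNT must be at least 1")
        return start, start + count - 1
    start, end = (int(part) for part in spec.split("-"))
    return start, end


print(parse_range("3:4"))  # (3, 6): 4 elements starting from 3
print(parse_range("3-6"))  # (3, 6): elements 3 through 6
```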
