Skip to content

Conversation

awr1
Copy link
Contributor

@awr1 awr1 commented Aug 31, 2019

Unforunately pointer arithmetic keeps coming up every so often which kinda suggests "maybe we should have this as a stdlib module." Although I am not a huge fan of "C-like functionality that is often more trouble than its worth" 😐

@juancarlospaco
Copy link
Collaborator

I think those can be template or {.inline.} ❔ So tiny. 🤔

## arithmetic is sometimes desirable for low-level ops/FFI with C.

func `+`*[T](x: ptr[T]; offset: SomeInteger): ptr[T] =
cast[ptr[T]](cast[ByteAddress](x) + cast[ByteAddress](offset * sizeof(T)))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the + should be a +%, do pointer wraparound like unsigned int? or is it undefined behaviour like signed integers?

Copy link
Contributor Author

@awr1 awr1 Sep 2, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume undefined. But yes I like +% better as a less accessible operator

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems better to use offset as a function name here - you want to be able to look for where these operations are used instead of hiding them behind an overload

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see also: https://doc.rust-lang.org/std/primitive.pointer.html - it has docs, examples, risks, etc that hint at how compilers think about this - the names are pretty standard too meaning it's easier to cross-reference with other languages / port code.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually meant this +% for the ByteAddress summation but +% for the op name sounds good as well: https://nim-lang.org/docs/system.html#%2B%25%2Cint64%2Cint64

I feel like offset is too much. Libraries that import this probably have lots of pointer arithmetics to do.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I use +! for pointer additions, the ! screams "dangerous".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+! at least is unique and thus possible to grep for.

I don't really see the argument that you use pointer operations "a lot" - in particular, you typically don't stack multiple pointer ops in a single expression so even with a name like offset (that's easier to google for than %!), you won't end up with excessively long lines because there will only be one of them

If you have lots and lots of pointer arithmetic to do, perhaps that's a sign that you should wrap it in something safer?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's for writing the wrapper ;)

func `+`*[T](x: ptr[T]; offset: SomeInteger): ptr[T] =
cast[ptr[T]](cast[ByteAddress](x) + cast[ByteAddress](offset * sizeof(T)))

func `-`*[T](x: ptr[T]; offset: SomeInteger): ptr[T] =
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be a template.
I expect those are used in very tight loops.

I dare say emitting assembly might also be considered if somehow compilers cannot generate optimized code.

Copy link
Collaborator

@juancarlospaco juancarlospaco Sep 2, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree they should all be templates.
Adding a single line documentation comment would be cool too
(I know is obvious what they do, but new people will not understand).

#

## This module offers operators for performing arithmetic on raw pointer types
## in Nim. Nim is *not* C: using pointer arithmetic in Nim should be avoided (if
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, and because of this I don't think we should be making pointer arithmetic easier by adding a module like this.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add the same logic as documentation explaining how can be done,
what are the problems doing so can bring on your code (pros/cons, explanation),
but not really adding it as a new std lib module ❔

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have lenientops, bitops, endians already for people to spend happy time debugging.

We can add a big bold Pointer arithmetics are very sharp tools. This module is unsafe, comes without guarantees and might cause you to spend endless hours in a debugger, eat your cat or cause the heat death of the universe.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kind of like the idea @juancarlospaco has, maybe as an addition to the Nim tutorial

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That being said the reason I thought I thought it should be a module was because it came up frequently enough (in the forums) for people to want something like this in full knowledge of its foot-shooting properties

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add it to the Nim Manual, Not the Tutorial, noobs should not start with Pointer Arithmethics for a Hello World. 😆

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I generally don't see warnings that don't say why something is unsafe as terribly useful - they're not going to stop anyone from using the module. the module does come with guarantees: it will offset pointers, nothing more.

The offset operation itself is harmless.

It is accessing memory through these new pointers that is unsafe, because the new location might be uninitialized, aliased or unaligned. The aliasing in particular brings several interesting issues to the table, because it opens the floodgates to a host of aliasing optimization bugs that generally are only visible when compiling in release mode, meaning crashes that start when you ship the product.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

meaning crashes that start when you ship the product.

Btw, you can test your software in -d:release mode too. ;-)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, we're talking about risks, and pointer aliasing is a clear and poorly understood risk, judging from the number of stack overflow posts about it, and it's specially devious because it only happens when optimizations are turned on. using this module without understanding them seems bad, and not for the cat - adding the module to the std lib without understanding them seems even worse.

@@ -0,0 +1,25 @@
#
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about calling it ptrops similar to bitops?

@awr1 awr1 changed the title Added ptrarith to stdlib Added ptrops to stdlib Sep 3, 2019
@arnetheduck
Copy link
Contributor

generally, ByteAddress is an int which means it will raise range errors. depending on the address space this will raise on arithmetic, specially on 32-bit where stuff like kernel options might affect the absolute values - see for example how linux uses the address space right around the 2gb mark (which is where spurious raises start to happen) depending on kernel options: https://en.wikipedia.org/wiki/Virtual_address_space#Linux

I'd recommend against merging this, and thinking about how and if ByteAddress makes sense at all, in general, as well as conducting a more thorough investigation on pointers and risks involved before adding anything like this to the std lib.

@awr1
Copy link
Contributor Author

awr1 commented Sep 3, 2019

@arnetheduck ByteAddress is just a type alias to int for "readability" reasons, IIRC int (Nim is different from C in this respect) is always guaranteed to be equal to pointer size on the system.

image

Perhaps ByteAddress should be deprecated because of its redundancy, but I don't know.

@arnetheduck
Copy link
Contributor

@awr1 yes, but consider what will happen with your patch on a 32-bit machine when you have a pointer-to-byte pointing to 0x7fffffff (a valid address) and add one to it? keep in mind that + for int raises on overflow.

## in Nim. Nim is *not* C: using pointer arithmetic in Nim should be avoided (if
## possible, use `array[T]`, `ptr UncheckedArray[T]`, etc.), however, pointer
## arithmetic is sometimes desirable for low-level ops/FFI with C. **Pointer
## arithmetic is a very sharp tool. This module is unsafe, comes without
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
## arithmetic is a very sharp tool. This module is unsafe, comes without
##
## Risks
## -----
##
## - Pointer Arithmetic is a very sharp tool. This module is unsafe.
## - Nim is *not* C: Using Pointer Arithmetic in Nim should be avoided.
## - Other reason why it can be problematic.
## - Other reason why it can be problematic.


template `+%`*[T](x: ptr[T]; offset: SomeInteger): ptr[T] =
## **Unsafe.** Adds `offset * sizeof(T)` bytes from the pointer `x`.
cast[ptr[T]](cast[ByteAddress](x) + cast[ByteAddress](offset * sizeof(T)))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to use +% here as well to avoid int range checks
https://nim-lang.org/docs/system.html#%2B%25%2CIntMax32%2CIntMax32


template `-%`*[T](x: ptr[T]; offset: SomeInteger): ptr[T] =
## **Unsafe.** Subtracts `offset * sizeof(T)` bytes from the pointer `x`.
cast[ptr[T]](cast[ByteAddress](x) - cast[ByteAddress](offset * sizeof(T)))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

@juancarlospaco
Copy link
Collaborator

Add a {.warning: "Some warning about this is unsafe".}
Thats one of the reasons the warning pragma exists anyway. 🤔

@Araq
Copy link
Member

Araq commented Sep 4, 2019

Sorry, rejected. Pointer arithmetic isn't that hard to do in Nim and people who need to do it can already figure out how to do it.

@krux02
Copy link
Contributor

krux02 commented Sep 4, 2019

I agree with Araq here. Pointer arithmetic already works when you cast to ptr UncheckedArray[T] instead of just ptr T.

@awr1 awr1 closed this Sep 5, 2019
@arnetheduck
Copy link
Contributor

we've added something similar to stew where we'll play around with it: https://github.com/status-im/nim-stew/blob/master/stew/ptrops.nim

but yeah, longer term it's usually better to use something else

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants