Skip to content

Conversation

@bzfbd
Copy link
Contributor

@bzfbd bzfbd commented Jun 2, 2020

There are cases when node_alloc() and node_free() are called but
the node_alloc_cd() has not run yet. This can lead to node_free()
freeing the *ni before the node_alloc_cb() runs which then modifies
memory (in this case the "is_in_peer_table") after free (in
addition to inserting state which is wrong).

To address this do multiple things:

  • change the is_in_peer_table from int to bool as that is what it is
    and use a macro and ath10k_dbg() to track state changes. This was
    mostly for debugging.
  • While here move athp_node_alloc_state into if_athp_main.c as it is
    not used anywhere outside the function. Helps understanding the
    code.
  • Store the allocated taskq entry in node_alloc() in "alloc_taskq_e"
    so we can check it during node_free() and cancel the callback if
    needed.
  • Rework parts of the athp_taskq to allow us to reliably cancel
    the entry: ad d a second list where we put the entries we are
    executing. Walking that list can be done lock-less. Add a
    athp_taskq_cancel() function which will either take the entry
    of the taskq or wait that no entries are run before returning.
    While there, remove extra () around the locking macros, remove
    an early (extra) on_queue = 1 in athp_taskq_queue() and fold
    some print lines into less vertical space.

Fixes Issue #28.

Sponsored by: Rubicon Communications, LLC (d/b/a "Netgate")

Bjoern A. Zeeb added 2 commits June 2, 2020 10:50
the node_alloc_cd() has not run yet.  This can lead to node_free()
freeing the *ni before the node_alloc_cb() runs which then modifies
memory (in this case the "is_in_peer_table") after free (in
addition to inserting state which is wrong).

To address this do multiple things:
- change the is_in_peer_table from int to bool as that is what it is
  and use a macro and ath10k_dbg() to track state changes.  This was
  mostly for debugging.
- While here move athp_node_alloc_state into if_athp_main.c as it is
  not used anywhere outside the function.  Helps understanding the
  code.
- Store the allocated taskq entry in node_alloc() in "alloc_taskq_e"
  so we can check it during node_free() and cancel the callback if
  needed.
- Rework parts of the athp_taskq to allow us to reliably cancel
  the entry: ad d a second list where we put the entries we are
  executing.  Walking that list can be done lock-less.  Add a
  athp_taskq_cancel() function which will either take the entry
  of the taskq or wait that no entries are run before returning.
  While there, remove extra () around the locking macros, remove
  an early (extra) on_queue = 1 in athp_taskq_queue() and fold
  some print lines into less vertical space.

Fixes Issue erikarn#28.

Sponsored by: Rubicon Communications, LLC (d/b/a "Netgate")
Sponsored by: Rubicon Communications, LLC (d/b/a "Netgate")
@bzfbd
Copy link
Contributor Author

bzfbd commented Jun 4, 2020

Should also compile now after adding the missing '"'.

@erikarn
Copy link
Owner

erikarn commented Jun 11, 2020

I'm going to try splitting up the node allocate and the node initialisation bits in net80211 to try and aide in this. At least then we can take a reference on the node.

I still think we'll need some flavour of what you're doing because of how the temporary node stuff is implemented (ie they're immediately deleted!) and how that could lead to some fun stuff in back to back node creation/deletion pulling the peer out of the firmware at the wrong time. It turns out if you screw THAT up then the firmware transmit path hangs...

@bzfbd
Copy link
Contributor Author

bzfbd commented Jun 12, 2020

I'd suggest applying it first and then separating it out and not doing it both in one go. I am not quite there yet to do all the net80211 direct handling. If we are going to specific "linux-style" calls from net80211 into the driver I think it would be advantageous to keep them as similar in name, arguments, and behaviour as we can, as all that will mean less fiddling when porting drivers.

@erikarn erikarn force-pushed the master branch 2 times, most recently from c55dc6b to f1c00bf Compare July 6, 2025 03:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants