cache_coherent_ is what I'm going with for now, after studying it further.
I really don't like "atomic" as the term. It conveys nothing about what the execution engine is actually doing with the thread caches or the bus snoop.
Planning to try out a flavor of Ryan's multi-threaded laned procs, with some extra threads hooked up separately to a job system.
Will most likely do 2 threads (main/helper) on live lanes, then 2 others on job queue loops.
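
Rough sketch of how that 2 + 2 split could look, just to pin the shape down. This is not Ryan's code and not a final design. Every name here (JobQueue, lane_entry, run_lanes_and_jobs, Totals) is made up for illustration, and the mutex + condition variable queue is only an assumed placeholder for whatever the job system ends up being. Lane 0 stands in for "main" and lane 1 for "helper": both run the same entry point and pick their slice of the work by lane index.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical job queue (assumption: a plain mutex + condition variable queue).
struct JobQueue {
    std::mutex mutex;
    std::condition_variable wake;
    std::deque<std::function<void()>> jobs;
    bool closed = false;

    void push(std::function<void()> job) {
        {
            std::lock_guard<std::mutex> lock(mutex);
            jobs.push_back(std::move(job));
        }
        wake.notify_one();
    }

    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex);
            closed = true;
        }
        wake.notify_all();
    }

    // Job thread loop: run jobs until the queue is both closed and drained.
    void worker_loop() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(mutex);
                wake.wait(lock, [this] { return closed || !jobs.empty(); });
                if (jobs.empty()) return;  // only reachable once closed
                job = std::move(jobs.front());
                jobs.pop_front();
            }
            job();  // run outside the lock
        }
    }
};

struct Totals {
    std::uint64_t lane_sum;
    std::uint64_t jobs_done;
};

// Lane entry point: every lane runs this same code on its own slice of items.
static void lane_entry(int lane_idx, int lane_count,
                       const std::vector<std::uint32_t>& items,
                       std::atomic<std::uint64_t>& lane_sum,
                       JobQueue& queue,
                       std::atomic<std::uint64_t>& jobs_done) {
    const std::size_t count = items.size();
    const std::size_t per_lane =
        (count + static_cast<std::size_t>(lane_count) - 1) / static_cast<std::size_t>(lane_count);
    const std::size_t first = std::min(count, static_cast<std::size_t>(lane_idx) * per_lane);
    const std::size_t last = std::min(count, first + per_lane);

    std::uint64_t local = 0;
    for (std::size_t i = first; i < last; ++i) local += items[i];
    lane_sum.fetch_add(local, std::memory_order_relaxed);

    // Each lane hands one piece of background work to the job threads.
    queue.push([&jobs_done] { jobs_done.fetch_add(1, std::memory_order_relaxed); });
}

// 2 lane threads (main/helper) plus 2 threads looping on the job queue.
Totals run_lanes_and_jobs(const std::vector<std::uint32_t>& items) {
    constexpr int lane_count = 2;
    constexpr int job_thread_count = 2;

    JobQueue queue;
    std::atomic<std::uint64_t> lane_sum{0};
    std::atomic<std::uint64_t> jobs_done{0};

    std::vector<std::thread> job_threads;
    for (int i = 0; i < job_thread_count; ++i)
        job_threads.emplace_back([&queue] { queue.worker_loop(); });

    std::vector<std::thread> lanes;
    for (int i = 0; i < lane_count; ++i)
        lanes.emplace_back([&, i] {
            lane_entry(i, lane_count, items, lane_sum, queue, jobs_done);
        });

    for (auto& t : lanes) t.join();        // lanes finish first
    queue.close();                         // then let job threads drain and exit
    for (auto& t : job_threads) t.join();

    return Totals{lane_sum.load(), jobs_done.load()};
}
```

Calling run_lanes_and_jobs on a small vector returns the combined lane sum plus the number of jobs the two job threads completed. The lanes only talk to the job threads through the queue, which is the "hooked up separately" part. Lane-to-lane syncing (barriers etc.) is left out of this sketch on purpose.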