Right, it's not the kernel, it's the bB code. Using too many cycles is undesirable because it makes the overscan part of the TV frame run for too many lines, which is quite likely to cause screen jittering or rolling.
The "ideal" TV frame generated from a 2600 game is made up of these parts and these line counts:
[VSYNC.............3 lines]
[VBLANK...........37 lines]
[VISIBLE SCREEN..192 lines]
[OVERSCAN.........30 lines]
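To put rough numbers on that, here is a small Python sketch (not bB code) that adds up the line counts above and converts them to CPU cycles. It assumes the standard NTSC figure of 76 CPU cycles per scanline; the names `FRAME_PARTS`, `total_lines` and `cycles_for` are just made up for the illustration. Treat the results as upper limits: bB's own housekeeping probably uses some of that time, so the budget actually available to your code is likely smaller.

```python
# NTSC Atari 2600: one scanline lasts 76 CPU cycles (228 color clocks / 3).
CYCLES_PER_LINE = 76

# Line counts taken from the "ideal" frame layout above.
FRAME_PARTS = [
    ("VSYNC", 3),
    ("VBLANK", 37),
    ("VISIBLE SCREEN", 192),
    ("OVERSCAN", 30),
]


def total_lines(parts):
    """Total scanlines in one frame."""
    return sum(lines for _, lines in parts)


def cycles_for(name, parts=FRAME_PARTS):
    """Raw CPU cycles that elapse during the named part of the frame."""
    for part, lines in parts:
        if part == name:
            return lines * CYCLES_PER_LINE
    raise KeyError(name)


print(total_lines(FRAME_PARTS))   # 262
print(cycles_for("OVERSCAN"))     # 2280
print(cycles_for("VBLANK"))       # 2812
```

So a full frame is 262 lines, and overscan gives you at most about 2280 cycles before the frame starts running long.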
The main part of your bB program runs in the overscan part of the TV frame until it hits a drawscreen command, at which point the rest of the frame parts are generated. When overscan time begins again, control returns to your bB code, just after the drawscreen command.
One way to avoid using too many cycles is to have multiple drawscreens strategically placed in CPU-heavy parts of your code. This approach won't work in the main loop of the game, as each drawscreen in your main loop will reduce the overall framerate, but it works well enough for screen setup code.
If you take this approach, you may want to set the colors of all screen elements to black until the screen is completely drawn, to avoid showing the screen as it's being built up.
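If it helps to plan where the extra drawscreens go, here is a rough Python sketch (again, not bB code) of the idea: walk through your setup steps in order and start a new frame whenever the next step would push you over the budget. The function name `plan_drawscreens`, the step costs and the 2000-cycle budget are all hypothetical; you would need to estimate real costs for your own code and pick a budget with some safety margin.

```python
def plan_drawscreens(step_costs, budget):
    """Group consecutive setup steps into frames that each fit the budget.

    step_costs: estimated cycle cost of each setup step, in program order.
    budget:     cycles you allow yourself per frame (hypothetical figure).
    Returns a list of frames; a drawscreen would go after each frame's steps.
    """
    frames = [[]]
    used = 0
    for cost in step_costs:
        if cost > budget:
            # This step cannot fit in one frame and must itself be split up.
            raise ValueError("a single step exceeds the per-frame budget")
        if used + cost > budget:
            # Next step would overflow: end this frame (drawscreen here).
            frames.append([])
            used = 0
        frames[-1].append(cost)
        used += cost
    return frames


# Hypothetical setup steps and budget, purely for illustration.
plan = plan_drawscreens([900, 800, 700, 1200, 400], budget=2000)
print(plan)       # [[900, 800], [700, 1200], [400]]
print(len(plan))  # 3
```

In this made-up case the setup code would need three drawscreens, one after each group of steps.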
It's also worth mentioning that part of your bB program can run in vblank by using the vblank keyword. The same rule applies there: use too many cycles and there will be too many vblank lines, which will cause jittering or rolling.