Hi,
I am writing my own real-time kernel for x86, and I have run into
something really strange (or maybe it isn't; it has been some time since
I was deep in the details of the x86 microarchitecture).
I measured the CPU clock cycles elapsed between the first assembly
instruction executed at the interrupt's entry point in the IDT and the
beginning of the C code of the user-defined interrupt handler, and the
result was a big surprise :-) It took about 2500 cycles, even though
there are only a handful of assembly instructions before the call to the
user-supplied IRQ handler.
A little more testing showed that almost all of those cycles (2300+) are
spent on an access to a global variable (via ds:[] addressing). None of
the instructions that access stack memory (push, pop, call, mov) makes a
noticeable difference. Does anybody have an idea why this happens? I am
testing on a 2.8 GHz Celeron in protected mode, set up for a flat memory
model with paging disabled.
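
A minimal sketch of how such a per-access measurement could be taken,
assuming GCC-style inline assembly and a CPU with SSE2 (for LFENCE). This
is an illustration, not the kernel's actual code: g_irq_counter,
read_tsc, cycles_between and time_global_reads are hypothetical names,
and the assert is only a sanity check for a hosted build.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical global standing in for the kernel variable in question. */
volatile uint32_t g_irq_counter = 0;

/* Read the time-stamp counter. The LFENCE in front is meant to keep
   earlier instructions from drifting past RDTSC on Intel CPUs
   (assumption: SSE2 is available). */
static inline uint64_t read_tsc(void)
{
    uint32_t lo, hi;
    __asm__ __volatile__("lfence\n\trdtsc"
                         : "=a"(lo), "=d"(hi)
                         :
                         : "memory");
    return ((uint64_t)hi << 32) | lo;
}

/* Cycles between two TSC readings. Unsigned subtraction gives the
   right difference even if the counter wrapped in between. */
static inline uint64_t cycles_between(uint64_t start, uint64_t end)
{
    return end - start;
}

/* Time `samples` consecutive reads of the global and store the cost of
   each one in out[], so the first access can be compared with the rest. */
void time_global_reads(uint64_t *out, int samples)
{
    assert(out != 0 && samples > 0);
    for (int i = 0; i < samples; i++) {
        uint64_t t0 = read_tsc();
        (void)g_irq_counter;      /* the ds:[] load under test */
        uint64_t t1 = read_tsc();
        out[i] = cycles_between(t0, t1);
    }
}
```

Each sample also includes the cost of the two counter reads themselves,
so timing an empty pair gives a baseline to subtract. Comparing out[0]
with the later samples would show whether the cost is paid once or on
every access.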
I can post the code of the interrupt entry point (up to the call into
the C entry point) if that helps; it's rather trivial and not long.
Thanks,
D