qemu-devel

Re: [Qemu-devel] [PATCH 4/5] tcg: reorder removal from lists in tb_phys_


From: Paolo Bonzini
Subject: Re: [Qemu-devel] [PATCH 4/5] tcg: reorder removal from lists in tb_phys_invalidate
Date: Tue, 29 Mar 2016 12:37:15 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.6.0


On 29/03/2016 12:03, Sergey Fedorov wrote:
>>> >> [...] I would suggest the following solution:
>>> >>  (1) Use 'tb->pc' as an indicator of whether TB is valid; check for it
>>> >>      in cpu_exec() when deciding on whether to patch the last executed
>>> >>      TB or not
>>> >>  (2) Use 'tcg_ctx.tb_ctx.tb_flush_count' to check for translation buffer
>>> >>      flushes; capture it before calling tb_gen_code() and compare to it
>>> >>      afterwards to check if tb_flush() has been called in between
>> > Of course that would work, but it would be slower.  
> What's going to be slower?

Checking tb->pc (or something similar) *and* tb_flush_count in 
tb_find_physical.

> > I think it is
> > unnecessary for two reasons:
> >
> > 1) There are two calls to cpu_exec_nocache.  One exits immediately with
> > "break;", the other always sets "next_tb = 0;".  Therefore it is safe in
> > both cases for cpu_exec_nocache to hijack cpu->tb_invalidated_flag.
> >
> > 2) if it were broken, it would _also_ be broken before these patches
> > because cpu_exec_nocache always runs with tb_lock taken.  
> 
> I can't see how cpu_exec_nocache() always runs with tb_lock taken.

It takes the lock itself. :)

From Fred's "tcg: protect TBContext with tb_lock", as it is in my mttcg
branch:

@@ -194,17 +194,23 @@ static void cpu_exec_nocache(CPUState *cpu, int max_cycles,
     if (max_cycles > CF_COUNT_MASK)
         max_cycles = CF_COUNT_MASK;
 
+    tb_lock();
     cpu->tb_invalidated_flag = 0;
     tb = tb_gen_code(cpu, orig_tb->pc, orig_tb->cs_base, orig_tb->flags,
                      max_cycles | CF_NOCACHE);
     tb->orig_tb = cpu->tb_invalidated_flag ? NULL : orig_tb;
     cpu->current_tb = tb;
+    tb_unlock();
+
     /* execute the generated code */
     trace_exec_tb_nocache(tb, tb->pc);
     cpu_tb_exec(cpu, tb->tc_ptr);
+
+    tb_lock();
     cpu->current_tb = NULL;
     tb_phys_invalidate(tb, -1);
     tb_free(tb);
+    tb_unlock();
 }
 #endif
 

It takes the lock before resetting tb_invalidated_flag.

cpu_exec_nocache is not used in user-mode emulation, so it's okay if
qemu.git doesn't take the lock yet.  (This kind of misunderstanding
about which code is thread-safe is going to be common until we have
MTTCG.  This was the reason for the patch "cpu-exec: elide more icount
code if CONFIG_USER_ONLY").

> > So I think
> > documenting the assumptions is better than changing them at the same
> > time as doing other changes.
>
> I'm not sure I understand you here exactly, but if implementing my
> proposal, it'd rather be a separate patch/series, I think.

Exactly.  For the purpose of these 5 patches, I would just document above
cpu_exec_nocache that callers should ensure that next_tb is zero.

Alternatively, you could add a patch that does:

    old_tb_invalidated_flag = cpu->tb_invalidated_flag;
    cpu->tb_invalidated_flag = 0;
    ...
    cpu->tb_invalidated_flag |= old_tb_invalidated_flag;

It could use the single global flag (and then it would go between patch 4
and patch 5) or it could use the CPU-specific one (and then it would go
after patch 5).
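
For illustration only, here is a minimal standalone sketch of that
save/restore pattern. It is not QEMU code: the one-field CPUState, the
gen_code_stub() helper standing in for tb_gen_code(), and the name
nocache_sketch() are all hypothetical, and locking is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in: only the one field discussed in this thread. */
typedef struct CPUState {
    bool tb_invalidated_flag;
} CPUState;

/* Hypothetical stand-in for tb_gen_code(): optionally simulates an
 * invalidation/flush happening while code is being generated. */
static void gen_code_stub(CPUState *cpu, bool simulate_flush)
{
    if (simulate_flush) {
        cpu->tb_invalidated_flag = true;
    }
}

/* Returns true if an invalidation happened during this call. */
static bool nocache_sketch(CPUState *cpu, bool simulate_flush)
{
    bool old_tb_invalidated_flag;
    bool invalidated_here;

    assert(cpu != NULL);

    /* Save the caller's pending state, then start from a clean flag. */
    old_tb_invalidated_flag = cpu->tb_invalidated_flag;
    cpu->tb_invalidated_flag = false;

    gen_code_stub(cpu, simulate_flush);
    invalidated_here = cpu->tb_invalidated_flag;

    /* Put back any invalidation the caller had not consumed yet. */
    cpu->tb_invalidated_flag |= old_tb_invalidated_flag;
    return invalidated_here;
}
```

With this shape, a flag that was already set on entry is still set on
return, so callers would no longer have to guarantee that next_tb is zero.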

However, I think documenting the requirements is fine.

>> > Your observation that tb->pc==-1 is not necessarily safe still holds of
>> > course.  Probably the best thing is an inline that can do one of:
>> >
>> > 1) set cs_base to an invalid value (anything nonzero is enough except on
>> > x86 and SPARC; SPARC can use all-ones)
>> >
>> > 2) sets the flags to an invalid combination (x86 can use all ones)
>> >
>> > 3) sets the PC to an invalid value (no one really needs it)
>
> It's a bit tricky. Does it really worth doing so instead of using a
> separate dedicated flag? Mainly, it should cost one extra compare on TB
> look-up. I suppose it's a kind of trade-off between performance and code
> clarity.

I think a new inline function cpu_make_tb_invalid would not be too tricky.
Just setting "tb->cs_base = -1;" is pretty much obviously safe for all the
targets that do not use cs_base at all and for SPARC, which sets it to a PC
(and thus a multiple of four).  x86 is the odd one out.
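
A rough standalone sketch of what such an inline could look like for the
non-x86 case follows. Only the name cpu_make_tb_invalid comes from this
mail. The cut-down TranslationBlock, the fixed 64-bit target_ulong and the
tb_matches() helper are hypothetical simplifications, not the real lookup
code, and x86 would need a different choice (for example the flags).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: fixed at 64 bits here; in QEMU the width is per target. */
typedef uint64_t target_ulong;

/* Hypothetical cut-down TB: just the fields a lookup compares. */
typedef struct TranslationBlock {
    target_ulong pc;
    target_ulong cs_base;
    uint32_t flags;
} TranslationBlock;

/* Mark a TB so that no lookup can match it again.  All-ones is never a
 * legitimate cs_base for targets that leave cs_base at zero, nor for
 * SPARC, where it holds a PC and is therefore a multiple of four. */
static inline void cpu_make_tb_invalid(TranslationBlock *tb)
{
    assert(tb != NULL);
    tb->cs_base = (target_ulong)-1;
}

/* Hypothetical lookup predicate: cs_base is compared here anyway, so an
 * invalidated TB is rejected without an additional compare. */
static inline bool tb_matches(const TranslationBlock *tb, target_ulong pc,
                              target_ulong cs_base, uint32_t flags)
{
    return tb->pc == pc && tb->cs_base == cs_base && tb->flags == flags;
}
```

The point of reusing an existing field is that the lookup path stays
unchanged, whereas a separate dedicated flag would add a compare to it.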

Thanks,

Paolo


