IRC logs for #openrisc Friday, 2014-07-11

--- Log opened Fri Jul 11 00:00:14 2014
-!- Netsplit .net <-> .split quits: jonmasters, rah, heroux, Amadiro, ysionneau, xlro, fotis2, zama, arokux, olofk_, (+21 more, use /NETSPLIT to show all of them)		01:27
-!- Netsplit over, joins: LoneTech, jeremy_bennett, rah, rokka, xlro, trevorman, olofk_, simoncook, FreezingCold, stekern (+21 more)		02:08
olofk_	maxpaln: Hi! Finally caught up with you :)	13:51
maxpaln	Hi	14:16
maxpaln	olofk_: How can I help?	14:17
olofk_	maxpaln: I wanted to invite you to the yearly OpenRISC conference, but I realized that I don't have any contact information	14:20
maxpaln	aha - you can send it via email to [email protected]	14:20
maxpaln	is email good enough?	14:20
olofk_	Perfect. Thank you	14:21
maxpaln	great - when is it?	14:21
maxpaln	or perhaps I should wait for the invite [the suspense is too much :-)]	14:21
olofk_	HAha	14:21
olofk_	October 11-12 in Munich	14:22
maxpaln	Great - I look forward to it. I will almost certainly be there.	14:22
olofk_	Excellent! Happy to hear	14:22
maxpaln	Bad news BTW - it is my bug :-( I was hoping to pass this onto someone else!!!!	14:24
olofk_	Ahh.. the most common kind of bug unfortunately :(	14:26
maxpaln	yeah, it's almost as if logic simulation isn't as thorough as running in HW! :-) Oh, well - this is definitely a job for next week. Have a good weekend all!	14:27
olofk_	You too!	14:28
stekern	dalias: I've found out what caused the pthread_cancel failure.	17:00
stekern	actually, there were two things	17:00
stekern	the first is a gcc bug causing this code: http://git.musl-libc.org/cgit/musl/tree/src/stdio/__lockfile.c#n9	17:00
stekern	to be turned into this: http://pastie.org/9378579#29-36	17:00
stekern	notice that it just loops depending on r14	17:01
stekern	I'll need to dig into why gcc does that, instead of this ugly local work-around I have: http://pastie.org/9378635	17:03
stekern	the second problem has to do with how our linux port handles the sig return syscall: it's checking for pending signals, so it would just loop around in the cancel_handler signal handler (since the cancel signal was always pending)	17:04
-!- Netsplit .net <-> .split quits: _franck_, ams, heroux, ssvb		17:41
-!- heroux_ is now known as heroux		17:41
dalias	stekern, it needs to check against the correct signal mask	18:03
dalias	musl masks the signal in the signal handler via modifying the ucontext_t	18:04
dalias	so maybe your definition of the ucontext_t is wrong	18:04
dalias	or maybe something is ignoring this signal mask	18:04
dalias	as for the __lockfile loop, my guess is that you have a wrong or missing constraint in the asm	18:05
dalias	in atomic.h	18:05
dalias	but i don't see any obvious mistakes	18:06
dalias	oh...	18:09
dalias	yes it's buggy	18:10
dalias	you can't use "r"(p)	18:10
dalias	you have to use "m"(*p)	18:10
dalias	and it should be output+input i think	18:11
dalias	alternatively volatile asm can be used; i think that's also safe	18:11
dalias	but accessing "m"(*p) is more semantically correct	18:11
stekern	oh, I remembered that I would have had a volatile there, but I don't	18:20
stekern	dalias: I'm not sure I follow what you mean, doesn't the signal only get masked once you've hit a cancelation point?	18:26
dalias	the signal handler re-raises its own signal, but also sets the bit in the saved signal mask in the ucontext_t	18:27
dalias	this is necessary for proper behavior with nested signal handlers	18:27
dalias	e.g. if the main flow of execution is at a cancellation point and interrupted by a signal handler, then the cancellation signal interrupts the signal handler	18:27
dalias	cancellation cannot be acted on immediately (unless the signal handler is also at a cancellation point)	18:28
dalias	but it needs to be acted on when the signal handler returns to the main flow of execution	18:28
ysionneau	dalias: how can you say that p is input and output?	18:29
ysionneau	you put it twice?	18:29
dalias	there's a form for input+output, i think it's a + sign or something	18:30
dalias	i forget the constraint syntax but it's in the gcc manual	18:30
dalias	however as long as it's volatile having it as both input and output is non-essential	18:30
dalias	output-only seems to work on other archs	18:31
dalias	but input+output would be preferable semantically	18:31
ysionneau	in gcc manual I see an example with a variable being output and input	18:31
ysionneau	it seems they just put it twice	18:31
ysionneau	the "m" constraint is for memory addresses, right? not registers	18:33
ysionneau	I think the l.swa and l.lwa use register and not addresses so I don't understand why you say he should use "m"(*p) ?	18:34
dalias	they take memory addresses	18:37
dalias	of an object in memory that they act upon	18:37
dalias	i'm not sure what the rules are for "m" expressions on or1k	18:37
ysionneau	so this is supposed to be semantic and not syntactic ?	18:38
ysionneau	I mean, syntactically, it's not a memory address but a register that the opcode is operating on	18:38
dalias	if "m" allows some advanced addressing expressions that aren't valid for l.swa and l.lwa, there should be a separate arch-specific "m" variant that guarantees the address will come in a single register	18:38
ysionneau	but indeed semantically it's operating on a memory address	18:38
dalias	if all memory address expressions are single registers, "m"(*p) and "r"(p) should be equivalent except that the former tells gcc that an object at the address is being accessed, rather than the value of the pointer being used purely as a value	18:39
dalias	this affects the types of transformations (moving and duplicating or deduplicating the asm) are valid	18:40
dalias	erm that sentence was ungrammatical, but hopefully it made sense	18:41
ysionneau	which are valid <= ?	18:45
ysionneau	I never used the "m" constraint so far, won't the "*p" be replaced by the value at address pointed to by "p" ?	18:49
ysionneau	I mean, what's the purpose of the '*' here?	18:51
dalias	no	19:13
dalias	"r"(*p) would load the value at address p into a register and pass it to the asm	19:13
dalias	"m"(*p) results in the asm argument being an address expression that refers to the object *p	19:14
dalias	for x86 this can include advanced address expressions like 16(%eax,%ecx,4)	19:14
dalias	i'm not sure what it can include for or1k	19:14
dalias	basically the %n corresponding to "m" operands should expand to a string that's valid for the source/dest for a load/store instruction	19:16
dalias	(or, on archs like x86 where other instructions can access memory directly, those instructions too)	19:16
ysionneau	dalias: ok thanks for the explanation :)	19:23
ysionneau	dalias: so if I understand correctly, if you have "int a;" you can just put "m"(a) and you don't need any &a at all, right?	19:24
dalias	right	19:24
ysionneau	whereas with "r" you would need "r"(&a)	19:24
dalias	if you wanted the address, right	19:24
ysionneau	ok, thanks :)	19:24
dalias	anyway "r"(&a) would likely be broken	19:24
ysionneau	yep as I understand	19:24
dalias	since the compiler might move loads/stores to the object across the asm	19:24
dalias	making the asm volatile and adding a "memory" clobber _probably_ avoids this	19:25
ysionneau	hummm there is still something I don't get	19:25
dalias	but imo using "m" is a nicer way of representing the fact that the asm accesses the object	19:25
ysionneau	since "r"(&a) puts the address of a into a register	19:25
ysionneau	and since this address anyway cannot change (only the value can change)	19:26
ysionneau	I don't get how it can be broken	19:26
dalias	example	19:26
dalias	int a, b;	19:26
dalias	a = 42;	19:26
stekern	dalias: fwiw, the powerpc port has probably the same problem. (I probably used that as a template)	19:27
dalias	__asm__ ("load %0, %1" : "=r"(b) : "r"(&a));	19:27
dalias	printf("%d\n", b);	19:27
dalias	stekern, ok thanks i'll check it out	19:27
dalias	ysionneau, in that example, the compiler has no reason it can't move the store a=42; across the asm	19:28
dalias	in fact it can possibly even eliminate it entirely	19:28
ysionneau	ok, but if you put volatile	19:28
ysionneau	it's forbidden, right?	19:28
dalias	i think so, but i don't really understand the semantics of volatile asm	19:28
ysionneau	to me volatile means there can be no optimization trespassing the barrier of the volatile block	19:28
ysionneau	ah maybe I'm wrong here	19:30
ysionneau	https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#Volatile	19:30
ysionneau	dalias: so I'm safe using "r"(&a) only if I manipulate it as an address and never dereferencing it	19:30
ysionneau	for instance I can add an offset to it, and store it to another pointer etc	19:31
dalias	ysionneau, yeah, i think so. this is very confusing imo, and i'm not sure why glibc did it that way...	19:31
ysionneau	ok, here I stop bothering you with that ;)	19:32
ysionneau	thanks again	19:32
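The difference between "r"(&a) and "m"(a) discussed above can be sketched in portable C. This is only an illustration under assumptions: the function names are made up for this sketch, and the asm templates are deliberately empty so that it builds on any target GCC or Clang supports, instead of using real l.lwa/l.swa instructions. Only the operand lists matter here.

```c
#include <assert.h>

/* Input operand "m"(*p): tells the compiler that the asm reads the object
 * p points to. The template is empty, so no instruction is emitted. */
static inline void asm_reads_object(const int *p)
{
    __asm__ ("" : : "m"(*p));
}

/* Input operand "r"(p): only the pointer value is handed to the asm.
 * Nothing in the operand list says the pointed-to object is accessed. */
static inline void asm_gets_pointer_value(const int *p)
{
    __asm__ ("" : : "r"(p));
}

int demo_m_constraint(void)
{
    int a, b;
    a = 42;
    asm_reads_object(&a);        /* the store a = 42 must be done before the asm */
    b = a;
    assert(b == 42);
    return b;
}

int demo_r_constraint(void)
{
    int a = 42;
    asm_gets_pointer_value(&a);  /* the compiler is told about &a, not about a */
    return a;
}
```

Both functions return 42 here because the empty template never touches memory. With a real load in the template, only the "m" form would tell the compiler that the store to a has to stay ahead of the asm.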
dalias	imo having a "memory" clobber and access to addresses should make gcc treat the asm as potentially accessing the memory...	19:32
stekern	and just for the record, both volatile and "m"(*p) works	19:32
dalias	but it's not defined that way, or else this is just a big gcc bug that's been around for a long time on many archs... :/	19:32
dalias	stekern, is it possible to use "+m" or whatever the notation for "input+output" is ?	19:33
stekern	it's not complaining about it at least	19:34
dalias	:)	19:34
dalias	we should definitely check out the ppc case. it's likely broken too on newer gcc at least	19:35
stekern	but the m without the + was already enough to make the bug in __lockfile go away, so I can't say if it has any extra effect ;)	19:35
dalias	since it's volatile i think the + is mostly extraneous	19:36
dalias	but i'd rather be explicit	19:36
stekern	you mean since p is volatile? yeah, maybe	19:36
dalias	nod	19:40
dalias	one reason i'd rather not rely on that though is that i'm not clear on the meaning of volatile-qualified pointers when the underlying object is not volatile	19:41
dalias	e.g. if i do	19:41
dalias	int x; volatile int *p = &x; ... use *p ...	19:41
dalias	does the compiler have to make accesses to *p as volatile? or can it use the fact that it knows that the pointed-to object is non-volatile to optimize?	19:42
dalias	imo it's a matter of whether the qualification of the lvalue through which the object is accessed is what matters....	19:42
dalias	...or whether the qualification of p as pointer-to-volatile just means that it's allowed to point to either volatile or non-volatile objects	19:43
dalias	the latter is how const-qualified pointers work, which is what makes me suspect the same may be true for volatile	19:43
dalias	but i've never found an authoritative answer one way or the other on the matter	19:44
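The declarations in question can be written out as follows. This sketch does not settle whether the compiler must treat accesses through p as volatile; it only shows the pointer-to-volatile form next to a volatile pointer and the const analogy mentioned above. The function name is hypothetical.

```c
#include <assert.h>

int qualified_pointer_forms(void)
{
    int x = 7;                 /* the underlying object is neither volatile nor const */

    volatile int *p = &x;      /* pointer to volatile int: *p is a volatile lvalue */
    int *volatile q = &x;      /* volatile pointer to plain int: q itself is volatile */
    const int *cp = &x;        /* pointer to const int that points at a non-const object */

    x = 8;                     /* allowed: x is not const, although cp points to it */

    return *p + *q + *cp;      /* all three read the same object, which now holds 8 */
}
```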
stekern	yeah, I'm not sure either	19:54
stekern	not that my answer would have been anywhere near authoritative either ;)	19:57
dalias	:)	20:01
dalias	so did all the failures go away now? :)	20:02
stekern	I need to read what you said in the backlog and see if I have misunderstood something about the cancellation signal, but I patched the kernel to not just check for pending signals in the sig_return handler and pthread_cancel passes with that	20:04
stekern	so I'm at least on the right track there.	20:04
stekern	when I'm done with that, there are some IPC failures as well (and I haven't run all tests yet, so there might be more to play with after that)	20:05
dalias	stekern, did you check if ucontext_t is defined right?	20:09
dalias	i don't think your kernel patch is right, but it's possible there's a kernel bug still	20:09
dalias	can you point me to the relevant kernel code?	20:10
stekern	yeah, I know my kernel patch isn't right, I just did that to check my theory	20:11
dalias	nod	20:12
dalias	signal.c has if (__copy_from_user(&set, &frame->uc.uc_sigmask, sizeof(set)))	20:13
dalias	which looks right	20:13
dalias	so i wonder if musl's idea of the address is wrong	20:13
dalias	ok it's possibly a kernel bug or possibly just a clash in interpretation of types	20:14
stekern	let's investigate that	20:14
dalias	mcontext_t (sigcontext) has an oldmask member	20:14
dalias	and there's also the sigset_t in the containing ucontext_t object	20:15
dalias	maybe oldmask is something else other than a signal mask?	20:15
dalias	it's only 32 bits so it can't be a signal mask	20:15
dalias	oldmask seems entirely unused	20:19
dalias	so i doubt that's the issue	20:19
dalias	what you might do is set a weird signal mask (e.g. put some ascii text in it ;-)	20:20
dalias	and then check the uc_sigmask from a signal handler to see if the mask matches what you expect	20:20
dalias	all i can figure is that maybe some type is defined wrong such that the uc_sigmask offset in ucontext_t is wrong in musl	20:20
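The check suggested above (set an unusual signal mask, then look at uc_sigmask from inside a handler) might look roughly like this on a POSIX system. It is a sketch, not the test that was actually run: the choice of SIGUSR1 as the probe signal and of SIGUSR2 plus SIGALRM as the "weird" mask, and all function and variable names, are assumptions made for this example.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t saw_usr1, saw_usr2, saw_alrm;

/* Record which signals the saved mask in the ucontext_t contains. */
static void probe_handler(int sig, siginfo_t *si, void *ctx)
{
    ucontext_t *uc = ctx;
    (void)sig;
    (void)si;
    saw_usr1 = sigismember(&uc->uc_sigmask, SIGUSR1);
    saw_usr2 = sigismember(&uc->uc_sigmask, SIGUSR2);
    saw_alrm = sigismember(&uc->uc_sigmask, SIGALRM);
}

/* Returns 1 if the handler saw exactly the mask that was set, 0 if not,
 * and -1 if setting things up failed. */
int probe_uc_sigmask(void)
{
    struct sigaction sa;
    sigset_t odd, old;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = probe_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    /* The recognisable mask: SIGUSR2 and SIGALRM blocked, nothing else. */
    sigemptyset(&odd);
    sigaddset(&odd, SIGUSR2);
    sigaddset(&odd, SIGALRM);
    if (sigprocmask(SIG_SETMASK, &odd, &old) != 0)
        return -1;

    raise(SIGUSR1);                        /* handler runs before raise() returns */
    sigprocmask(SIG_SETMASK, &old, NULL);  /* restore the caller's mask */

    return saw_usr2 == 1 && saw_alrm == 1 && saw_usr1 == 0;
}
```

If the library's ucontext_t layout disagrees with the kernel's, the handler would be expected to read the wrong bytes and the function would return 0.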
stekern	yes, and I understand how cancel_handler works now (at last too) ;)	20:23
stekern	I'll take a deeper look at what you suggested	20:24
stekern	btw, another question: our CANCEL_REG_IP will point at the instruction after the l.sys instruction, won't that break this? https://github.com/skristiansson/musl-or1k/blob/master/src/thread/cancel_impl.c#L49	20:25
stekern	just adding a l.nop after the l.sys would fix that if that's the case I guess	20:26
dalias	sigh i found the problem	20:28
dalias	the uapi headers are wrong and don't match the actual api the kernel uses...	20:28
dalias	at least that seems to be the case... just a sec	20:28
dalias	maybe not	20:29
stekern	https://github.com/skristiansson/musl-or1k/blob/master/src/thread/or1k/syscall_cp.s#L16 <- the l.sys I'm speaking about	20:32
dalias	but the real pt_regs and the struct ptrace.h exposes to userspace don't match	20:33
dalias	hm, what's your concern?	20:33
stekern	that the syscall will not be treated as a cancellation point, since the pc that the cancel_handler sees will be outside cp_begin/cp_end	20:34
stekern	(this is completely unrelated to the sigmask discussion, just to be clear ;))	20:36
dalias	under what conditions?	20:37
dalias	if the syscall has completed (or will return with EINTR), the signal handler _should_ see a pc outside the range	20:37
dalias	it should only see a pc in the range if the syscall is going to resume after the signal handler returns	20:38
dalias	(or if the syscall wasn't even started yet -- there's a tiny window of possibility for that too)	20:38
stekern	yes, but if I understand things correctly, it will not work in the "syscall is going to resume after the signal handler returns" case	20:39
dalias	why not?	20:39
dalias	in that case pc should point to the syscall instruction	20:40
stekern	since the context store upon syscall entry will store the l.sys+4 address	20:40
blueCmd_	why hasn't someone created an AXI4-Lite <-> Wishbone module?	20:40
dalias	that should be decremented if the syscall is to be restarted	20:40
stekern	aaaaah! of course	20:41
dalias	how else would the kernel store the knowledge that the syscall needs to be restarted?	20:41
dalias	:)	20:41
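The reasoning about the saved pc can be modelled with plain integers. This is a toy model under assumptions, not the musl or kernel code: it assumes the end label of the cancellable region sits directly after the l.sys instruction, that the range check is half-open, and the addresses and names are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define INSN_SIZE 4u   /* OpenRISC instructions are 4 bytes wide */

/* Half-open range check of the kind a cancellation handler could apply. */
bool pc_in_cancel_range(uintptr_t pc, uintptr_t cp_begin, uintptr_t cp_end)
{
    return pc >= cp_begin && pc < cp_end;
}

/* pc the signal handler would observe for a syscall instruction at sys_addr. */
uintptr_t observed_pc(uintptr_t sys_addr, bool will_restart)
{
    uintptr_t saved = sys_addr + INSN_SIZE;  /* stored on syscall entry: l.sys + 4 */
    if (will_restart)
        saved -= INSN_SIZE;                  /* rewound so l.sys executes again */
    return saved;
}

/* Example layout: region starts at 0x1000, l.sys at 0x1008, end label at 0x100c. */
bool restart_case_is_cancellable(void)
{
    return pc_in_cancel_range(observed_pc(0x1008, true), 0x1000, 0x100c);
}

bool completed_case_is_cancellable(void)
{
    return pc_in_cancel_range(observed_pc(0x1008, false), 0x1000, 0x100c);
}
```

In this model a syscall that will be restarted is seen inside the range, and one that has completed (or returns EINTR) is seen outside it, matching the behaviour described above.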
dalias	hmm in entry.S, the kernel loads the stack pointer as the arg to _sys_rt_sigreturn	20:42
dalias	which is treated as a pointer to pt_regs	20:42
stekern	mmm	20:46
dalias	but as far as i can tell that's wrong...	20:47
dalias	sp seems to point to an rt_sigframe, not pt_regs	20:48
dalias	but maybe i'm missing something...	20:49
stekern	yes, that looks odd...	20:52
stekern	or does it? r1 that is passed to _sys_rt_sigreturn points to pt_regs	20:56
stekern	then the pre-context-switch sp (that is pointing to a rt_sigframe) is loaded from that	20:57
dalias	how does r1 come to point to pt_regs when the signal handler returns and the restorer thunk makes the rt_sigreturn syscall?	21:06
dalias	oh maybe this is a new pt_regs from the syscall entry point	21:07
stekern	yes	21:07
dalias	i see	21:07
dalias	so this pt_regs struct is in kernelspace, saved by the syscall entry point	21:07
stekern	right	21:08
dalias	and the regs->sp is what the stack pointer in userspace pointed to at the time of the syscall	21:08
dalias	which is the rt_sigframe	21:08
dalias	so i don't see what's wrong.	21:08
dalias	the uc_sigmask is read from userspace and loaded	21:09
stekern	I'll amuse myself with passing some fun values in uc_sigmask (as you suggested earlier) and see if they come through right	21:10
dalias	ok	21:10
dalias	my best guess is that there's some stupid mismatch so musl is setting the wrong bit	21:11
olofk_	blueCmd_: Yeah, I've been thinking about writing a bridge for both axi4 and axi4lite, but haven't gotten around to it	21:24
olofk_	axi4lite is probably way easier	21:25
olofk_	Full axi4 would probably require Wishbone b4 to be somewhat efficient	21:25
-!- olofk_ is now known as olofk		21:25
blueCmd_	olofk: cool, in that case I think I will start working on it	21:46
blueCmd_	I have multiple cores in my design that have AXI4-Lite ports, so it makes sense	21:47
olofk	Yeah, the axi4 family is getting quite popular	21:50
olofk	Its main drawback is the license	21:50
blueCmd_	I haven't signed anything so.. :P	21:50
olofk	And the insane amount of signals for full axi4	21:50
blueCmd_	surprisingly hard to find even simple axi4 cores to use as a test against	21:52
blueCmd_	found http://opencores.org/project,axi_slave but that uses the dead RobustVerilog	21:52
stekern	this uc_sigmask isn't coming through as it should at all	22:34
dalias	looks like something is broken in the kernel then :-p	22:36
dalias	or in musl's structs for or1k mcontext, etc	22:36
dalias	......	22:36
dalias	look in bits/signal.h	22:37
dalias	the definition of mcontext_t has inconsistent size depending on #if defined(_GNU_SOURCE) || defined(_BSD_SOURCE)	22:37
dalias	that's certainly a bug	22:37
dalias	it should be 35, not 41	22:37
dalias	and likewise for gregset_t i think	22:37
dalias	(34 rather than 40)	22:37
dalias	stekern, you still there?	22:46
dalias	i think i found your bug	22:46
dalias	i'm guessing you were counting bytes rather than longs, or something	22:46
stekern	yes, I see	22:48
stekern	counting wrong, that's what I was doing in any case ;)	22:49
dalias	my guess is you counted bytes for the extra 2 registers, then counted longs for the oldmask :-p	22:50
stekern	yeah, I must have counted pc+sr as bytes...	22:50
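The arithmetic behind 35 versus 41 can be checked with a small sketch. The struct layout below is an assumption reconstructed from the numbers in the discussion (32 general-purpose registers, pc, sr, then oldmask); it is not copied from the real bits/signal.h, and all names are hypothetical.

```c
#include <assert.h>

enum { OR1K_GPRS = 32, OR1K_REG_BYTES = 4 };

/* Assumed layout: 32 GPRs followed by pc and sr, one long each. */
struct sketch_gregs {
    unsigned long gpr[OR1K_GPRS];
    unsigned long pc;
    unsigned long sr;
};

/* Assumed layout: the registers plus a single oldmask word. */
struct sketch_mcontext {
    struct sketch_gregs regs;
    unsigned long oldmask;
};

enum {
    /* Sizes counted in longs, which is what the array bounds need. */
    GREGSET_LONGS  = sizeof(struct sketch_gregs) / sizeof(unsigned long),
    MCONTEXT_LONGS = sizeof(struct sketch_mcontext) / sizeof(unsigned long),

    /* The slip: pc and sr counted as 2 * 4 bytes instead of 2 longs. */
    MISCOUNTED_GREGSET  = OR1K_GPRS + 2 * OR1K_REG_BYTES,
    MISCOUNTED_MCONTEXT = MISCOUNTED_GREGSET + 1
};

_Static_assert(GREGSET_LONGS == 34, "32 GPRs + pc + sr");
_Static_assert(MCONTEXT_LONGS == 35, "gregs + oldmask");
_Static_assert(MISCOUNTED_GREGSET == 40, "the value that was used by mistake");
_Static_assert(MISCOUNTED_MCONTEXT == 41, "the value that was used by mistake");
```

Because the counts are in units of unsigned long, the assertions hold regardless of whether the host's long is 4 or 8 bytes.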
stekern	yup, this looks much better	22:53
dalias	:)	22:54
stekern	what a silly mistake... that's so typically me... on the bright side, I wouldn't have learned as much as I did if I hadn't made that silly mistake	22:55
blueCmd_	stekern: feel free to backport everything to glibc as well ...............	22:55
dalias	i don't think this issue affects glibc. it's just in stekern's bits/signal.h for musl	22:56
stekern	blueCmd_: backport my silly mistakes? nah, that doesn't sound like a good idea =P	22:56
blueCmd_	I mean, I'm super excited about doing that - but I wouldn't feel nice robbing you of the opportunity	22:56
dalias	otoh backporting musl's cancellation to glibc would be very nice, since glibc's is unusably racy	22:56
blueCmd_	stekern: well, skip the mistakes and just port the fixes :P	22:56
dalias	(see http://ewontfix.com/16/)	22:57
blueCmd_	dalias: the amount of work you guys are putting in musl puts glibc to shame	22:57
blueCmd_	I hacked it together so I could run "normal" apps, that's about it	22:57
stekern	dalias: yes, I really liked the way you've implemented the cancellation points in musl	22:58
dalias	stekern, it took 2 or 3 tries to come up with this; i scrapped the first few implementations of cancellation and replaced them completely	22:58
blueCmd_	dalias: are there any plans / interest from the Debian community to adopt musl as its primary libc?	22:58
dalias	i doubt it	22:59
dalias	might happen in the _really_ long term :)	22:59
blueCmd_	what about gentoo?	22:59
dalias	but debian is a HUGE distro and pretty conservative in what they do	22:59
blueCmd_	is it runnable with gentoo?	22:59
blueCmd_	I guess it is	22:59
dalias	there's a gentoo stage-whatever (i forget how their 'stage' numbering works) with musl as the system libc	22:59
dalias	not sure how mature it is tho	23:00
blueCmd_	cool. olofk will have to start porting it then :)	23:00
dalias	and alpine linux 3.x is using musl as the system libc	23:00
blueCmd_	dalias: well, we wouldn't want anything unstable	23:00
dalias	i've got alpine on my laptop now; it's pretty nice	23:00
blueCmd_	Debian for or1k crashes almost < 10 times per hour, so we're pretty proud of that	23:00
blueCmd_	;)	23:01
dalias	;-)	23:01
blueCmd_	It's kind of stable if you don't do anything	23:01
dalias	alpine's package selection is somewhat spartan, but it's fast, light, security-oriented, and has a responsive developer community	23:01
blueCmd_	unless you run stekern's SMP	23:01
blueCmd_	dalias: how big is the system footprint?	23:03
stekern	dalias: btw, due to another silly mistake (I think I hadn't saved the file in between compiles), I have to retract the comment about "+m" working. it didn't like that at all	23:05
dalias	bluecmd_, my /usr is ~650 megs with xfce-based X setup. /lib is another 300 megs, mostly firmware and kernel modules	23:05
dalias	stekern, oh?	23:05
stekern	just "m" works though	23:05
dalias	weird	23:06
dalias	did you put the +m in output or input?	23:06
dalias	it would need to be in the output operands section, not the input one	23:06
dalias	despite it declaring an operand that's both input and output	23:06
stekern	ah, let's try again then	23:06
dalias	:)	23:06
dalias	bluecmd_, so you can have a pretty good useful gui system in ~1gb	23:08
blueCmd_	dalias: indeed	23:08
dalias	for servers, firewalls, etc. of course you can get it much much smaller	23:08
stekern	that it did like better	23:09
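The placement of "+m" that was accepted above can be sketched as follows. The template is left empty so the example compiles on any target; the function name is hypothetical and this is not the atomic.h code itself. The commented-out line shows the placement that GCC is expected to reject.

```c
#include <assert.h>

/* "+m"(*p) declares *p as both read and written by the asm. Although it is
 * an input as well, it belongs in the first (output) operand list. */
int bump_after_asm(int *p)
{
    __asm__ __volatile__ ("" : "+m"(*p));

    /* Expected to fail to compile, since '+' is not valid on an input operand:
     * __asm__ __volatile__ ("" : : "+m"(*p));
     */

    *p += 1;      /* ordinary C access after the asm */
    return *p;
}
```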
dalias	one of the things i'm most impressed with is how fast firefox starts	23:09
dalias	i really want to try to understand it	23:09
dalias	since in theory glibc's dynamic linker has all sorts of trickery to improve firefox load time	23:09
dalias	and musl's is completely naive	23:09
dalias	so i'm guessing either something fancy they do actually hurts performance a lot, or that the influence of dynamic linking in load time is over-hyped	23:11
stekern	blueCmd_: bah, my SMP WIP is stable, I had it running top for two days!	23:17
stekern	hmm, pthread_robust is timing out.	23:24
dalias	:(	23:29
stekern	well, some fun left for tomorrow then, way past bedtime here now =P	23:35
stekern	dalias: thanks again for all the help tracking down my bugs	23:35
dalias	np	23:47
--- Log closed Sat Jul 12 00:00:15 2014
