Discussion:
Bug#609371: linux-image-2.6.37-trunk-sparc64: module scsi_mod: Unknown relocation: 36
(too old to reply)
David Miller
2011-01-16 05:17:22 UTC
Permalink
From: Richard Mortimer <***@oldelvet.org.uk>
Date: Fri, 14 Jan 2011 16:08:30 +0000

[ Frederic, Steven, Ingo, the short version of the story is that we
need to make it such that the _ftrace_events section is aligned
properly for 64-bit systems, and in particular that GCC can see this
too. Otherwise GCC thinks 64-bit words will be unaligned and we get
all kinds of funky relocations being used in modules on sparc64 that
really there is never any reason to see. ]
They seem to come from scsi.o at
.asciz "scsi_eh_wakeup"
.align 4
.type event_scsi_eh_wakeup, #object
.size event_scsi_eh_wakeup, 136
.skip 16
.uaxword event_class_scsi_eh_wakeup
.uaxword .LLC51
.skip 8
.skip 40
.uaxword ftrace_event_type_funcs_scsi_eh_wakeup
.uaxword print_fmt_scsi_eh_wakeup
.skip 40
These ".uaxword" emissions are a bug, and if we let them stand then
ftrace events are going to load every long word object using a
sequence of byte-sized accesses. This will absolutely kill
performance.

This is exactly why I wanted to find out why this started happening
out of nowhere, instead of blindly adding support for new relocation
types to the sparc module loader.

The set of relocations supported by the sparc module loading code is
not meant to be "all relocations which are legal" but rather it's
meant to support the most minimum set of relocations that the kernel
actually _needs_ and should be making use of.

I think the problem we have here is that the _ftrace_events section is
not aligned sufficiently. That ".align 4" mnemonic is a good indication
of this. It should at least "8" on sparc64.

So we need to figure out why that is happening, and fix that instead.

Thanks.
Richard Mortimer
2011-01-16 14:17:49 UTC
Permalink
Post by David Miller
Date: Fri, 14 Jan 2011 16:08:30 +0000
[ Frederic, Steven, Ingo, the short version of the story is that we
need to make it such that the _ftrace_events section is aligned
properly for 64-bit systems, and in particular that GCC can see this
too. Otherwise GCC thinks 64-bit words will be unaligned and we get
all kinds of funky relocations being used in modules on sparc64 that
really there is never any reason to see. ]
They seem to come from scsi.o at
.asciz "scsi_eh_wakeup"
.align 4
.type event_scsi_eh_wakeup, #object
.size event_scsi_eh_wakeup, 136
.skip 16
.uaxword event_class_scsi_eh_wakeup
.uaxword .LLC51
.skip 8
.skip 40
.uaxword ftrace_event_type_funcs_scsi_eh_wakeup
.uaxword print_fmt_scsi_eh_wakeup
.skip 40
These ".uaxword" emissions are a bug, and if we let them stand then
ftrace events are going to load every long word object using a
sequence of byte-sized accesses. This will absolutely kill
performance.
This is exactly why I wanted to find out why this started happening
out of nowhere, instead of blindly adding support for new relocation
types to the sparc module loader.
The set of relocations supported by the sparc module loading code is
not meant to be "all relocations which are legal" but rather it's
meant to support the most minimum set of relocations that the kernel
actually _needs_ and should be making use of.
I think the problem we have here is that the _ftrace_events section is
not aligned sufficiently. That ".align 4" mnemonic is a good indication
of this. It should at least "8" on sparc64.
So we need to figure out why that is happening, and fix that instead.
Thanks.
I'm wondering if gcc is just getting better at honouring the source
code. The DEFINE_EVENT macros in include/trace/ftrace.h have a
__aligned__(4) attribute in them. Maybe that should be 8 on sparc64
systems.
The aligned 4 seems to be unchanged since include/trace/ftrace.h was
created in f42c85e74faa422cf0bc747ed808681145448f88 in April 2009.

example

#undef DEFINE_EVENT
#define DEFINE_EVENT(template, call, proto, args) \
\
static struct ftrace_event_call __used \
__attribute__((__aligned__(4))) \
__attribute__((section("_ftrace_events"))) event_##call = { \
.name = #call, \
.class = &event_class_##template, \
.event.funcs = &ftrace_event_type_funcs_##template, \
.print_fmt = print_fmt_##template, \
};


I haven't tried making any changes/recompiling yet because that change
will pretty much force a full recompile and that takes upwards of 12
hours on the old sparc64 box that I have. I will wait for feedback
before I try anything.

Regards

Richard

--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
David Miller
2011-01-16 19:39:44 UTC
Permalink
From: Richard Mortimer <***@oldelvet.org.uk>
Date: Sun, 16 Jan 2011 14:17:49 +0000
Post by Richard Mortimer
I'm wondering if gcc is just getting better at honouring the source
code. The DEFINE_EVENT macros in include/trace/ftrace.h have a
__aligned__(4) attribute in them. Maybe that should be 8 on sparc64
systems.
The aligned 4 seems to be unchanged since include/trace/ftrace.h was
created in f42c85e74faa422cf0bc747ed808681145448f88 in April 2009.
That needs to be at least "8" on 64-bit systems. Why is this aligned
directive there at all?
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Steven Rostedt
2011-01-17 14:11:26 UTC
Permalink
[ Added Mathieu on Cc, since he likes alignments ;-) ]
Post by David Miller
Date: Sun, 16 Jan 2011 14:17:49 +0000
Post by Richard Mortimer
I'm wondering if gcc is just getting better at honouring the source
code. The DEFINE_EVENT macros in include/trace/ftrace.h have a
__aligned__(4) attribute in them. Maybe that should be 8 on sparc64
systems.
The aligned 4 seems to be unchanged since include/trace/ftrace.h was
created in f42c85e74faa422cf0bc747ed808681145448f88 in April 2009.
That needs to be at least "8" on 64-bit systems. Why is this aligned
directive there at all?
IIRC, the problem showed up in 64-bit systems. OK, x86-64 (but of
course ;-).

The problem comes when the linker puts these sections together. We read
all the sections as one big array. If the linker puts in holes, then
this breaks the array, and the kernel crashes while reading the section.

I guess one solution is to remove the alignment at the allocation and
place it at the structure. This will mean all accesses to this structure
will need to be on an alignment.

-- Steve
Bastian Blank
2011-01-17 14:37:56 UTC
Permalink
Post by Steven Rostedt
The problem comes when the linker puts these sections together. We read
all the sections as one big array. If the linker puts in holes, then
this breaks the array, and the kernel crashes while reading the section.
I think this are still bugs in the linker script. #537862[1] was the
first occurance, but I don't find further information.

Bastian

[1]: http://bugs.debian.org/537862
--
What kind of love is that? Not to be loved; never to have shown love.
-- Commissioner Nancy Hedford, "Metamorphosis",
stardate 3219.8
Mathieu Desnoyers
2011-01-17 19:35:25 UTC
Permalink
Post by Steven Rostedt
[ Added Mathieu on Cc, since he likes alignments ;-) ]
Oh yes, alignments are so much fun! (for some definitions of fun) ;)
Post by Steven Rostedt
Post by David Miller
Date: Sun, 16 Jan 2011 14:17:49 +0000
Post by Richard Mortimer
I'm wondering if gcc is just getting better at honouring the source
code. The DEFINE_EVENT macros in include/trace/ftrace.h have a
__aligned__(4) attribute in them. Maybe that should be 8 on sparc64
systems.
The aligned 4 seems to be unchanged since include/trace/ftrace.h was
created in f42c85e74faa422cf0bc747ed808681145448f88 in April 2009.
That needs to be at least "8" on 64-bit systems. Why is this aligned
directive there at all?
IIRC, the problem showed up in 64-bit systems. OK, x86-64 (but of
course ;-).
The problem comes when the linker puts these sections together. We read
all the sections as one big array. If the linker puts in holes, then
this breaks the array, and the kernel crashes while reading the section.
I guess one solution is to remove the alignment at the allocation and
place it at the structure. This will mean all accesses to this structure
will need to be on an alignment.
The problem with these alignments is that they are just a hint to gcc, telling
it what the minimum alignment of a type should be. gcc is free to align on a
larger boundary if it wants to.

But the following test program is very instructive:

#include <stdio.h>

struct test {
void *a;
void *b;
void *c;
void *d;
void *e;
void *f;
void *g;
void *h;
void *i;
void *j;
void *k;
void *l;
void *m;
void *n;
void *o;
void *p;
void *q;
};

int main()
{
struct test __attribute__((aligned(4))) v;
printf("%d\n", __alignof__(v));
return 0;
}

(on x86_64, with gcc 4.5.1 and gcc 4.4.4)

if we put the "__attribute__((aligned(4)))" at the v definition (variable
attribute), the program returns an alignment of 4. If we move it after struct
test declaration (type attribute), the program returns an alignment of 8 (thus
taking the max between the attribute alignment and the largest field).

But that's a real problem, because in include/trace/ftrace.h, we have an
alignment of 4 forced on the definition, but there is a mismatch with
trace_events.c:

extern struct ftrace_event_call __start_ftrace_events[];
extern struct ftrace_event_call __stop_ftrace_events[];

for which the alignment attribute is missing (so an alignment of 8 will be
used there).

So it all worked as long as the size of struct ftrace_event_call was a multiple
of 8 bytes (struct ftrace_event_call constains 2 integers if we exclude the perf
fields), but the new fields added by perf contain a supplementary 4-byte
integer, which seems to be causing the breakage: the structures are appended one
next to another when defined, but the iteration on these structures thinks they
are 8-byte aligned.

Steven, what were you trying to fix in the first place when you added the
aligned(4) to the definition ? It might have just been that the _ftrace_events
section needed to be aligned on at least 8 bytes in the linker scripts, but was
only aligned on 4-bytes. Forcing the definition alignment down to 4 possibly
fixed the problem you experienced on x86_64, but seems to be causing other
problems.

I would recommend to:

- Keep the linker script _ftrace_events alignment as it is now (aligned on 32
bytes).
- Remove the aligned(4) attributes from all struct ftrace_event_call
definitions.

And see how this works. The only problem that might come up is if gcc decides to
align struct ftrace_event_call (which is about 136 bytes in size) on an
alignment larger than 32 bytes, which would be really surprising.

Mathieu
Post by Steven Rostedt
-- Steve
--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
David Miller
2011-01-18 06:36:58 UTC
Permalink
From: Mathieu Desnoyers <***@efficios.com>
Date: Mon, 17 Jan 2011 14:35:25 -0500
Post by Mathieu Desnoyers
Steven, what were you trying to fix in the first place when you added the
aligned(4) to the definition ? It might have just been that the _ftrace_events
section needed to be aligned on at least 8 bytes in the linker scripts, but was
only aligned on 4-bytes. Forcing the definition alignment down to 4 possibly
fixed the problem you experienced on x86_64, but seems to be causing other
problems.
- Keep the linker script _ftrace_events alignment as it is now (aligned on 32
bytes).
- Remove the aligned(4) attributes from all struct ftrace_event_call
definitions.
Completely agreed.
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
David Miller
2011-01-18 05:34:48 UTC
Permalink
From: Steven Rostedt <***@goodmis.org>
Date: Mon, 17 Jan 2011 09:11:26 -0500
Post by Steven Rostedt
The problem comes when the linker puts these sections together. We read
all the sections as one big array. If the linker puts in holes, then
this breaks the array, and the kernel crashes while reading the section.
Ummm, this sounds like a serious binutils bug.

If it's doing this, than things like constructor and destructor
sections will not work properly. Those are constructed in the
same exact way, objects with constructors register a pointer into
a special section (".ctors" or something like that) and all such
section contents are linked together into an array for the final
binary. crt0.S and similar code on program startup walks the
array and executes every pointer in that array.

Or is the problem, rather, that you're storing different kinds of data
structures into this section? Is the problem that they turn out to
have different sizes and this is what disturbs the array walk?

I really don't understand what the problem is, and the align(4)
fix is definitely the wrong thing to do for several reasons.

Where are these "holes" coming from? Reading the commit message for
the change that introduced this problem
(86c38a31aa7f2dd6e74a262710bf8ebf7455acc5), it seems like the issue is
coming from the compiler, and that fact that beforehand some
structures were marked with 4-byte alignment. The existing 4-byte
alignment cases sound like the real bug to me, where is that happening
and why?

If you want these things to be in the array, you have to define them
all identically. But down-aligning structures with pointers in them
is definitely not the right fix.


--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
David Miller
2011-01-18 06:00:39 UTC
Permalink
From: David Miller <***@davemloft.net>
Date: Mon, 17 Jan 2011 21:34:48 -0800 (PST)
Post by David Miller
Where are these "holes" coming from? Reading the commit message for
the change that introduced this problem
(86c38a31aa7f2dd6e74a262710bf8ebf7455acc5), it seems like the issue is
coming from the compiler, and that fact that beforehand some
structures were marked with 4-byte alignment. The existing 4-byte
alignment cases sound like the real bug to me, where is that happening
and why?
Ok, now I'm getting a bit irritated. I went and looked into the
history of this "aligned(4)" tag on ftrace_event_call.

The reason for it is completely undocumented, and in fact it gets
added in a commit whose commit message doesn't mention this part
of the commit's change at all.

That really sucks.

The commit message is:

commit 1473e4417c79f12d91ef91a469699bfa911f510f
Author: Steven Rostedt <***@redhat.com>
Date: Tue Feb 24 14:15:08 2009 -0500

tracing: make event directory structure

This patch adds the directory /debug/tracing/events/ that will contain
all the registered trace points.

# ls /debug/tracing/events/
sched_kthread_stop sched_process_fork sched_switch
sched_kthread_stop_ret sched_process_free sched_wait_task
sched_migrate_task sched_process_wait sched_wakeup
sched_process_exit sched_signal_send sched_wakeup_new

# ls /debug/tracing/events/sched_switch/
enable

# cat /debug/tracing/events/sched_switch/enable
1

# cat /debug/tracing/set_event
sched_switch

Signed-off-by: Steven Rostedt <***@redhat.com>

And in this commit we have:

@@ -39,6 +41,7 @@ static void ftrace_unreg_event_##call(void) \
} \
\
static struct ftrace_event_call __used \
+__attribute__((__aligned__(4))) \
__attribute__((section("_ftrace_events"))) event_##call = { \
.name = #call, \
.regfunc = ftrace_reg_event_##call, \

Excuse me?

This is a serious infraction of responsible development, and it makes
bugs hard if not impossible to analyze. I see it was postedo on LKML,
and I'm surprised nobody asked "Why are you adding that alignment, and
whatever the reason why isn't it mentioned in the commit message or in
a comment?"

Anyways, this commit is where this alignment originates.

It probably is spurious and unnecessary and we can remove it
completely from _ALL_ of the cases. Did anyone do any research
whatsoever into why the alignment tag was put there in the first place
when the GCC 4.5 bug showed up?

Steven what is the deal here?

Can't we just do the following patch? Also, I notice similar
alignment tag being used for syscall_metadata, what's the story there?

--------------------
ftrace: Remove unnecessary alignment tag from ftrace_event_call.

It's completely unnecessary and causes problems on platforms
where this tag down-aligns the structure's alignment.

Signed-off-by: David S. Miller <***@davemloft.net>

diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 18cd068..05295f2 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -128,7 +128,6 @@ extern struct trace_event_functions exit_syscall_print_funcs;
static struct syscall_metadata \
__attribute__((__aligned__(4))) __syscall_meta_##sname; \
static struct ftrace_event_call __used \
- __attribute__((__aligned__(4))) \
__attribute__((section("_ftrace_events"))) \
event_enter_##sname = { \
.name = "sys_enter"#sname, \
@@ -142,7 +141,6 @@ extern struct trace_event_functions exit_syscall_print_funcs;
static struct syscall_metadata \
__attribute__((__aligned__(4))) __syscall_meta_##sname; \
static struct ftrace_event_call __used \
- __attribute__((__aligned__(4))) \
__attribute__((section("_ftrace_events"))) \
event_exit_##sname = { \
.name = "sys_exit"#sname, \
diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
index e16610c..bd86d1b 100644
--- a/include/trace/ftrace.h
+++ b/include/trace/ftrace.h
@@ -68,8 +68,7 @@

#undef DEFINE_EVENT
#define DEFINE_EVENT(template, name, proto, args) \
- static struct ftrace_event_call __used \
- __attribute__((__aligned__(4))) event_##name
+ static struct ftrace_event_call __used event_##name

#undef DEFINE_EVENT_PRINT
#define DEFINE_EVENT_PRINT(template, name, proto, args, print) \
@@ -447,7 +446,6 @@ static inline notrace int ftrace_get_offsets_##call( \
* };
*
* static struct ftrace_event_call __used
- * __attribute__((__aligned__(4)))
* __attribute__((section("_ftrace_events"))) event_<call> = {
* .name = "<call>",
* .class = event_class_<template>,
@@ -580,7 +578,6 @@ static struct ftrace_event_class __used event_class_##call = { \
#define DEFINE_EVENT(template, call, proto, args) \
\
static struct ftrace_event_call __used \
-__attribute__((__aligned__(4))) \
__attribute__((section("_ftrace_events"))) event_##call = { \
.name = #call, \
.class = &event_class_##template, \
@@ -594,7 +591,6 @@ __attribute__((section("_ftrace_events"))) event_##call = { \
static const char print_fmt_##call[] = print; \
\
static struct ftrace_event_call __used \
-__attribute__((__aligned__(4))) \
__attribute__((section("_ftrace_events"))) event_##call = { \
.name = #call, \
.class = &event_class_##template, \
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
David Miller
2011-01-18 06:08:57 UTC
Permalink
From: David Miller <***@davemloft.net>
Date: Mon, 17 Jan 2011 22:00:39 -0800 (PST)
Post by David Miller
ftrace: Remove unnecessary alignment tag from ftrace_event_call.
It's completely unnecessary and causes problems on platforms
where this tag down-aligns the structure's alignment.
...

Ok, unless we can explain why these alignments are needed at all, we
should kill all of them:

--------------------
tracing: Remove unnecessary __aligned__(4) tag from trace data.

It's completely unnecessary and causes problems on platforms
where this tag down-aligns the structure's alignment.

Signed-off-by: David S. Miller <***@davemloft.net>

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 320d6c9..8620723 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -94,7 +94,6 @@ void ftrace_likely_update(struct ftrace_branch_data *f, int val, int expect);
#define __branch_check__(x, expect) ({ \
int ______r; \
static struct ftrace_branch_data \
- __attribute__((__aligned__(4))) \
__attribute__((section("_ftrace_annotated_branch"))) \
______f = { \
.func = __func__, \
@@ -129,7 +128,6 @@ void ftrace_likely_update(struct ftrace_branch_data *f, int val, int expect);
({ \
int ______r; \
static struct ftrace_branch_data \
- __attribute__((__aligned__(4))) \
__attribute__((section("_ftrace_branch"))) \
______f = { \
.func = __func__, \
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 18cd068..e23a188 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -125,10 +125,8 @@ extern struct trace_event_functions enter_syscall_print_funcs;
extern struct trace_event_functions exit_syscall_print_funcs;

#define SYSCALL_TRACE_ENTER_EVENT(sname) \
- static struct syscall_metadata \
- __attribute__((__aligned__(4))) __syscall_meta_##sname; \
+ static struct syscall_metadata __syscall_meta_##sname; \
static struct ftrace_event_call __used \
- __attribute__((__aligned__(4))) \
__attribute__((section("_ftrace_events"))) \
event_enter_##sname = { \
.name = "sys_enter"#sname, \
@@ -139,10 +137,8 @@ extern struct trace_event_functions exit_syscall_print_funcs;
__TRACE_EVENT_FLAGS(enter_##sname, TRACE_EVENT_FL_CAP_ANY)

#define SYSCALL_TRACE_EXIT_EVENT(sname) \
- static struct syscall_metadata \
- __attribute__((__aligned__(4))) __syscall_meta_##sname; \
+ static struct syscall_metadata __syscall_meta_##sname; \
static struct ftrace_event_call __used \
- __attribute__((__aligned__(4))) \
__attribute__((section("_ftrace_events"))) \
event_exit_##sname = { \
.name = "sys_exit"#sname, \
@@ -156,7 +152,6 @@ extern struct trace_event_functions exit_syscall_print_funcs;
SYSCALL_TRACE_ENTER_EVENT(sname); \
SYSCALL_TRACE_EXIT_EVENT(sname); \
static struct syscall_metadata __used \
- __attribute__((__aligned__(4))) \
__attribute__((section("__syscalls_metadata"))) \
__syscall_meta_##sname = { \
.name = "sys"#sname, \
@@ -172,7 +167,6 @@ extern struct trace_event_functions exit_syscall_print_funcs;
SYSCALL_TRACE_ENTER_EVENT(_##sname); \
SYSCALL_TRACE_EXIT_EVENT(_##sname); \
static struct syscall_metadata __used \
- __attribute__((__aligned__(4))) \
__attribute__((section("__syscalls_metadata"))) \
__syscall_meta__##sname = { \
.name = "sys_"#sname, \
diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
index e16610c..bd86d1b 100644
--- a/include/trace/ftrace.h
+++ b/include/trace/ftrace.h
@@ -68,8 +68,7 @@

#undef DEFINE_EVENT
#define DEFINE_EVENT(template, name, proto, args) \
- static struct ftrace_event_call __used \
- __attribute__((__aligned__(4))) event_##name
+ static struct ftrace_event_call __used event_##name

#undef DEFINE_EVENT_PRINT
#define DEFINE_EVENT_PRINT(template, name, proto, args, print) \
@@ -447,7 +446,6 @@ static inline notrace int ftrace_get_offsets_##call( \
* };
*
* static struct ftrace_event_call __used
- * __attribute__((__aligned__(4)))
* __attribute__((section("_ftrace_events"))) event_<call> = {
* .name = "<call>",
* .class = event_class_<template>,
@@ -580,7 +578,6 @@ static struct ftrace_event_class __used event_class_##call = { \
#define DEFINE_EVENT(template, call, proto, args) \
\
static struct ftrace_event_call __used \
-__attribute__((__aligned__(4))) \
__attribute__((section("_ftrace_events"))) event_##call = { \
.name = #call, \
.class = &event_class_##template, \
@@ -594,7 +591,6 @@ __attribute__((section("_ftrace_events"))) event_##call = { \
static const char print_fmt_##call[] = print; \
\
static struct ftrace_event_call __used \
-__attribute__((__aligned__(4))) \
__attribute__((section("_ftrace_events"))) event_##call = { \
.name = #call, \
.class = &event_class_##template, \
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Mathieu Desnoyers
2011-01-18 16:46:33 UTC
Permalink
Post by David Miller
Date: Mon, 17 Jan 2011 22:00:39 -0800 (PST)
Post by David Miller
ftrace: Remove unnecessary alignment tag from ftrace_event_call.
It's completely unnecessary and causes problems on platforms
where this tag down-aligns the structure's alignment.
...
Ok, unless we can explain why these alignments are needed at all, we
ftrace: linker script add missing struct align

We should add the missing "STRUCT_ALIGN();" in
include/asm-generic/vmlinux.lds.h as a preliminary step to remove the ftrace
bogus structure alignments. Moving all STRUCT_ALIGN() for FTRACE_EVENTS()
and TRACE_SYSCALLS() into the definitions, so the alignment is only done if
these infrastructures are configured in.

Also align TRACE_PRINTKS on 8 bytes to make sure the beginning of the section is
aligned on pointer size.

Signed-off-by: Mathieu Desnoyers <***@efficios.com>
---
include/asm-generic/vmlinux.lds.h | 19 ++++++++++---------
1 file changed, 10 insertions(+), 9 deletions(-)

Index: linux-2.6-lttng/include/asm-generic/vmlinux.lds.h
===================================================================
--- linux-2.6-lttng.orig/include/asm-generic/vmlinux.lds.h
+++ linux-2.6-lttng/include/asm-generic/vmlinux.lds.h
@@ -107,7 +107,8 @@
#endif

#ifdef CONFIG_TRACE_BRANCH_PROFILING
-#define LIKELY_PROFILE() VMLINUX_SYMBOL(__start_annotated_branch_profile) = .; \
+#define LIKELY_PROFILE() STRUCT_ALIGN(); \
+ VMLINUX_SYMBOL(__start_annotated_branch_profile) = .; \
*(_ftrace_annotated_branch) \
VMLINUX_SYMBOL(__stop_annotated_branch_profile) = .;
#else
@@ -115,7 +116,8 @@
#endif

#ifdef CONFIG_PROFILE_ALL_BRANCHES
-#define BRANCH_PROFILE() VMLINUX_SYMBOL(__start_branch_profile) = .; \
+#define BRANCH_PROFILE() STRUCT_ALIGN(); \
+ VMLINUX_SYMBOL(__start_branch_profile) = .; \
*(_ftrace_branch) \
VMLINUX_SYMBOL(__stop_branch_profile) = .;
#else
@@ -123,7 +125,8 @@
#endif

#ifdef CONFIG_EVENT_TRACING
-#define FTRACE_EVENTS() VMLINUX_SYMBOL(__start_ftrace_events) = .; \
+#define FTRACE_EVENTS() STRUCT_ALIGN(); \
+ VMLINUX_SYMBOL(__start_ftrace_events) = .; \
*(_ftrace_events) \
VMLINUX_SYMBOL(__stop_ftrace_events) = .;
#else
@@ -131,7 +134,8 @@
#endif

#ifdef CONFIG_TRACING
-#define TRACE_PRINTKS() VMLINUX_SYMBOL(__start___trace_bprintk_fmt) = .; \
+#define TRACE_PRINTKS() . = ALIGN(8); \
+ VMLINUX_SYMBOL(__start___trace_bprintk_fmt) = .; \
*(__trace_printk_fmt) /* Trace_printk fmt' pointer */ \
VMLINUX_SYMBOL(__stop___trace_bprintk_fmt) = .;
#else
@@ -139,7 +143,8 @@
#endif

#ifdef CONFIG_FTRACE_SYSCALLS
-#define TRACE_SYSCALLS() VMLINUX_SYMBOL(__start_syscalls_metadata) = .; \
+#define TRACE_SYSCALLS() STRUCT_ALIGN(); \
+ VMLINUX_SYMBOL(__start_syscalls_metadata) = .; \
*(__syscalls_metadata) \
VMLINUX_SYMBOL(__stop_syscalls_metadata) = .;
#else
@@ -169,11 +174,7 @@
LIKELY_PROFILE() \
BRANCH_PROFILE() \
TRACE_PRINTKS() \
- \
- STRUCT_ALIGN(); \
FTRACE_EVENTS() \
- \
- STRUCT_ALIGN(); \
TRACE_SYSCALLS()

/*
--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Steven Rostedt
2011-01-18 17:33:39 UTC
Permalink
Post by Mathieu Desnoyers
Post by David Miller
Date: Mon, 17 Jan 2011 22:00:39 -0800 (PST)
Post by David Miller
ftrace: Remove unnecessary alignment tag from ftrace_event_call.
It's completely unnecessary and causes problems on platforms
where this tag down-aligns the structure's alignment.
...
Ok, unless we can explain why these alignments are needed at all, we
ftrace: linker script add missing struct align
We should add the missing "STRUCT_ALIGN();" in
include/asm-generic/vmlinux.lds.h as a preliminary step to remove the ftrace
bogus structure alignments. Moving all STRUCT_ALIGN() for FTRACE_EVENTS()
and TRACE_SYSCALLS() into the definitions, so the alignment is only done if
these infrastructures are configured in.
Also align TRACE_PRINTKS on 8 bytes to make sure the beginning of the section is
aligned on pointer size.
If I can make it crash without the alignments and this fixes the issue,
I'll apply both patches.

Thanks,

-- Steve
Post by Mathieu Desnoyers
---
include/asm-generic/vmlinux.lds.h | 19 ++++++++++---------
1 file changed, 10 insertions(+), 9 deletions(-)
Index: linux-2.6-lttng/include/asm-generic/vmlinux.lds.h
===================================================================
--- linux-2.6-lttng.orig/include/asm-generic/vmlinux.lds.h
+++ linux-2.6-lttng/include/asm-generic/vmlinux.lds.h
@@ -107,7 +107,8 @@
#endif
#ifdef CONFIG_TRACE_BRANCH_PROFILING
-#define LIKELY_PROFILE() VMLINUX_SYMBOL(__start_annotated_branch_profile) = .; \
+#define LIKELY_PROFILE() STRUCT_ALIGN(); \
+ VMLINUX_SYMBOL(__start_annotated_branch_profile) = .; \
*(_ftrace_annotated_branch) \
VMLINUX_SYMBOL(__stop_annotated_branch_profile) = .;
#else
@@ -115,7 +116,8 @@
#endif
#ifdef CONFIG_PROFILE_ALL_BRANCHES
-#define BRANCH_PROFILE() VMLINUX_SYMBOL(__start_branch_profile) = .; \
+#define BRANCH_PROFILE() STRUCT_ALIGN(); \
+ VMLINUX_SYMBOL(__start_branch_profile) = .; \
*(_ftrace_branch) \
VMLINUX_SYMBOL(__stop_branch_profile) = .;
#else
@@ -123,7 +125,8 @@
#endif
#ifdef CONFIG_EVENT_TRACING
-#define FTRACE_EVENTS() VMLINUX_SYMBOL(__start_ftrace_events) = .; \
+#define FTRACE_EVENTS() STRUCT_ALIGN(); \
+ VMLINUX_SYMBOL(__start_ftrace_events) = .; \
*(_ftrace_events) \
VMLINUX_SYMBOL(__stop_ftrace_events) = .;
#else
@@ -131,7 +134,8 @@
#endif
#ifdef CONFIG_TRACING
-#define TRACE_PRINTKS() VMLINUX_SYMBOL(__start___trace_bprintk_fmt) = .; \
+#define TRACE_PRINTKS() . = ALIGN(8); \
+ VMLINUX_SYMBOL(__start___trace_bprintk_fmt) = .; \
*(__trace_printk_fmt) /* Trace_printk fmt' pointer */ \
VMLINUX_SYMBOL(__stop___trace_bprintk_fmt) = .;
#else
@@ -139,7 +143,8 @@
#endif
#ifdef CONFIG_FTRACE_SYSCALLS
-#define TRACE_SYSCALLS() VMLINUX_SYMBOL(__start_syscalls_metadata) = .; \
+#define TRACE_SYSCALLS() STRUCT_ALIGN(); \
+ VMLINUX_SYMBOL(__start_syscalls_metadata) = .; \
*(__syscalls_metadata) \
VMLINUX_SYMBOL(__stop_syscalls_metadata) = .;
#else
@@ -169,11 +174,7 @@
LIKELY_PROFILE() \
BRANCH_PROFILE() \
TRACE_PRINTKS() \
- \
- STRUCT_ALIGN(); \
FTRACE_EVENTS() \
- \
- STRUCT_ALIGN(); \
TRACE_SYSCALLS()
/*
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Steven Rostedt
2011-01-18 18:16:35 UTC
Permalink
Post by Steven Rostedt
Post by Mathieu Desnoyers
Also align TRACE_PRINTKS on 8 bytes to make sure the beginning of the section is
aligned on pointer size.
If I can make it crash without the alignments and this fixes the issue,
I'll apply both patches.
After applying David's patch, I do indeed get a crash. I'll now apply
yours and see if it fixes the issue.

-- Steve


--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Steven Rostedt
2011-01-18 18:26:18 UTC
Permalink
Post by Steven Rostedt
Post by Steven Rostedt
Post by Mathieu Desnoyers
Also align TRACE_PRINTKS on 8 bytes to make sure the beginning of the section is
aligned on pointer size.
If I can make it crash without the alignments and this fixes the issue,
I'll apply both patches.
After applying David's patch, I do indeed get a crash. I'll now apply
yours and see if it fixes the issue.
Your patch doesn't seem to fix it either. I'll investigate this further.

-- Steve


--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Mathieu Desnoyers
2011-01-18 20:13:23 UTC
Permalink
Post by Steven Rostedt
Post by Steven Rostedt
Post by Steven Rostedt
Post by Mathieu Desnoyers
Also align TRACE_PRINTKS on 8 bytes to make sure the beginning of the section is
aligned on pointer size.
If I can make it crash without the alignments and this fixes the issue,
I'll apply both patches.
After applying David's patch, I do indeed get a crash. I'll now apply
yours and see if it fixes the issue.
Your patch doesn't seem to fix it either. I'll investigate this further.
I think David's patch missed kernel/trace/trace_export.c

struct ftrace_event_call __used \
__attribute__((__aligned__(4))) \
__attribute__((section("_ftrace_events"))) event_##call = { \

and kernel/trace/trace.h:

#define FTRACE_ENTRY(call, struct_name, id, tstruct, print) \
extern struct ftrace_event_call \
__attribute__((__aligned__(4))) event_##call;

does it help if you remove these ?

Mathieu
Post by Steven Rostedt
-- Steve
--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Steven Rostedt
2011-01-18 20:22:24 UTC
Permalink
Post by Mathieu Desnoyers
Post by Steven Rostedt
Post by Steven Rostedt
Post by Steven Rostedt
Post by Mathieu Desnoyers
Also align TRACE_PRINTKS on 8 bytes to make sure the beginning of the section is
aligned on pointer size.
If I can make it crash without the alignments and this fixes the issue,
I'll apply both patches.
After applying David's patch, I do indeed get a crash. I'll now apply
yours and see if it fixes the issue.
Your patch doesn't seem to fix it either. I'll investigate this further.
I think David's patch missed kernel/trace/trace_export.c
struct ftrace_event_call __used \
__attribute__((__aligned__(4))) \
__attribute__((section("_ftrace_events"))) event_##call = { \
#define FTRACE_ENTRY(call, struct_name, id, tstruct, print) \
extern struct ftrace_event_call \
__attribute__((__aligned__(4))) event_##call;
does it help if you remove these ?
I doubt it, but I'll try it anyway.

-- Steve



--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Mathieu Desnoyers
2011-01-19 05:08:45 UTC
Permalink
Post by Steven Rostedt
Post by Mathieu Desnoyers
Post by Steven Rostedt
Post by Steven Rostedt
Post by Steven Rostedt
Post by Mathieu Desnoyers
Also align TRACE_PRINTKS on 8 bytes to make sure the beginning of the section is
aligned on pointer size.
If I can make it crash without the alignments and this fixes the issue,
I'll apply both patches.
After applying David's patch, I do indeed get a crash. I'll now apply
yours and see if it fixes the issue.
Your patch doesn't seem to fix it either. I'll investigate this further.
I think David's patch missed kernel/trace/trace_export.c
struct ftrace_event_call __used \
__attribute__((__aligned__(4))) \
__attribute__((section("_ftrace_events"))) event_##call = { \
#define FTRACE_ENTRY(call, struct_name, id, tstruct, print) \
extern struct ftrace_event_call \
__attribute__((__aligned__(4))) event_##call;
does it help if you remove these ?
I doubt it, but I'll try it anyway.
The following works fine for me now. Comments are welcome.


tracing: use __aligned__() variable and type attributes to work around gcc bug

I just played with the possible combinations, and here are my current results,
with the appropriately aligned linker script sections (with my patch applied):

- No aligned() type attribute nor variable attribute. I get a crash on x86_64
(NULL pointer exception when executing __trace_add_event_call, the 5th call).
__alignof__(struct ftrace_event_call) is worth 8.

- adding type attribute aligned(32) on the struct ftrace_event_call and
struct syscall_metadata declarations seems to work fine, but consumes space
needlessly (struct ftrace_event_call grows from 136 to 160 bytes). Note that
tracepoints use the type attribute aligned(32), which falls in this category
(waste of space). We might want to do better.

- adding type attribute aligned(8) crashes. Note that the gcc documentation for
type attributes explains that this specifies the minimum alignment for the
type, so the compiler is free to choose a larger one.

- adding type attribute (packed, aligned(8)) should force the compiler to
consider a 8-byte alignment for the type (as shown by __alignof__), but it
doesn't work (crash).

I really don't like specifying alignment as variable attributes (rather than
type attributes), because the extern array that iterates on the attributes has
no clue of the alignment used (same for pointer arithmetic). Unfortunately, it
seems to be the only way to really make sure gcc does not use an alignment
larger than prescribed.

So far, my somewhat premature conclusion is that gcc honours the aligned() type
attribute for the array iteration (thus expecting a 136 bytes array aligned on 8
bytes boundaries), but does not apply the aligned() type attribute to the
structure definition when used in combination with the "section" variable
attribute, choosing to align the structure on 32 bytes instead. My hunch is thatthis is a gcc bug that appeared a few months before Fri Nov 14 17:47:47 2008
-0500, which led me to change the tracepoint structure alignment from 8 to 32
in commit 7e066fb870fcd1025ec3ba7bbde5d541094f4ce1 because of a very similar
problem.

One way to work around this could be to specify the aligned(8) type attribute to
the structure declarations *and* the aligned(8) variable attribute to the
structure definitions.

On 32-bit architectures, we really want a aligned(4), and on 64-bit
architectures, aligned(8). Represent this by creating:

#define __long_aligned __attribute__((__aligned__(__alignof__(long))))

The following patch passes the initial smoke test (it boots fine, without any
NULL pointer exception on x86_64).
(this patch is based on a 2.6.37 kernel)

Signed-off-by: Mathieu Desnoyers <***@efficios.com>
---
include/linux/compiler.h | 8 +++++---
include/linux/ftrace_event.h | 2 +-
include/linux/syscalls.h | 20 ++++++++++++--------
include/trace/ftrace.h | 8 ++++----
include/trace/syscall.h | 2 +-
kernel/trace/trace.h | 2 +-
kernel/trace/trace_export.c | 2 +-
7 files changed, 25 insertions(+), 19 deletions(-)

Index: linux-2.6-lttng/include/linux/compiler.h
===================================================================
--- linux-2.6-lttng.orig/include/linux/compiler.h
+++ linux-2.6-lttng/include/linux/compiler.h
@@ -57,6 +57,8 @@ extern void __chk_io_ptr(const volatile
# include <linux/compiler-intel.h>
#endif

+#define __long_aligned __attribute__((__aligned__(__alignof__(long))))
+
/*
* Generic compiler-dependent macros required for kernel
* build go below this comment. Actual compiler/compiler version
@@ -78,7 +80,7 @@ struct ftrace_branch_data {
};
unsigned long miss_hit[2];
};
-};
+} __long_aligned;

/*
* Note: DISABLE_BRANCH_PROFILING can be used by special lowlevel code
@@ -94,7 +96,7 @@ void ftrace_likely_update(struct ftrace_
#define __branch_check__(x, expect) ({ \
int ______r; \
static struct ftrace_branch_data \
- __attribute__((__aligned__(4))) \
+ __long_aligned \
__attribute__((section("_ftrace_annotated_branch"))) \
______f = { \
.func = __func__, \
@@ -129,7 +131,7 @@ void ftrace_likely_update(struct ftrace_
({ \
int ______r; \
static struct ftrace_branch_data \
- __attribute__((__aligned__(4))) \
+ __long_aligned \
__attribute__((section("_ftrace_branch"))) \
______f = { \
.func = __func__, \
Index: linux-2.6-lttng/include/linux/syscalls.h
===================================================================
--- linux-2.6-lttng.orig/include/linux/syscalls.h
+++ linux-2.6-lttng/include/linux/syscalls.h
@@ -126,11 +126,13 @@ extern struct trace_event_functions exit

#define SYSCALL_TRACE_ENTER_EVENT(sname) \
static struct syscall_metadata \
- __attribute__((__aligned__(4))) __syscall_meta_##sname; \
+ __long_aligned \
+ __syscall_meta_##sname; \
static struct ftrace_event_call \
- __attribute__((__aligned__(4))) event_enter_##sname; \
+ __long_aligned \
+ event_enter_##sname; \
static struct ftrace_event_call __used \
- __attribute__((__aligned__(4))) \
+ __long_aligned \
__attribute__((section("_ftrace_events"))) \
event_enter_##sname = { \
.name = "sys_enter"#sname, \
@@ -141,11 +143,13 @@ extern struct trace_event_functions exit

#define SYSCALL_TRACE_EXIT_EVENT(sname) \
static struct syscall_metadata \
- __attribute__((__aligned__(4))) __syscall_meta_##sname; \
+ __long_aligned \
+ __syscall_meta_##sname; \
static struct ftrace_event_call \
- __attribute__((__aligned__(4))) event_exit_##sname; \
+ __long_aligned \
+ event_exit_##sname; \
static struct ftrace_event_call __used \
- __attribute__((__aligned__(4))) \
+ __long_aligned \
__attribute__((section("_ftrace_events"))) \
event_exit_##sname = { \
.name = "sys_exit"#sname, \
@@ -158,7 +162,7 @@ extern struct trace_event_functions exit
SYSCALL_TRACE_ENTER_EVENT(sname); \
SYSCALL_TRACE_EXIT_EVENT(sname); \
static struct syscall_metadata __used \
- __attribute__((__aligned__(4))) \
+ __long_aligned \
__attribute__((section("__syscalls_metadata"))) \
__syscall_meta_##sname = { \
.name = "sys"#sname, \
@@ -174,7 +178,7 @@ extern struct trace_event_functions exit
SYSCALL_TRACE_ENTER_EVENT(_##sname); \
SYSCALL_TRACE_EXIT_EVENT(_##sname); \
static struct syscall_metadata __used \
- __attribute__((__aligned__(4))) \
+ __long_aligned \
__attribute__((section("__syscalls_metadata"))) \
__syscall_meta__##sname = { \
.name = "sys_"#sname, \
Index: linux-2.6-lttng/include/trace/ftrace.h
===================================================================
--- linux-2.6-lttng.orig/include/trace/ftrace.h
+++ linux-2.6-lttng/include/trace/ftrace.h
@@ -79,7 +79,7 @@
#undef DEFINE_EVENT
#define DEFINE_EVENT(template, name, proto, args) \
static struct ftrace_event_call __used \
- __attribute__((__aligned__(4))) event_##name;
+ __long_aligned event_##name;

#undef DEFINE_EVENT_PRINT
#define DEFINE_EVENT_PRINT(template, name, proto, args, print) \
@@ -447,7 +447,7 @@ static inline notrace int ftrace_get_off
* };
*
* static struct ftrace_event_call __used
- * __attribute__((__aligned__(4)))
+ * __long_aligned
* __attribute__((section("_ftrace_events"))) event_<call> = {
* .name = "<call>",
* .class = event_class_<template>,
@@ -581,7 +581,7 @@ static struct ftrace_event_class __used
#define DEFINE_EVENT(template, call, proto, args) \
\
static struct ftrace_event_call __used \
-__attribute__((__aligned__(4))) \
+__long_aligned \
__attribute__((section("_ftrace_events"))) event_##call = { \
.name = #call, \
.class = &event_class_##template, \
@@ -595,7 +595,7 @@ __attribute__((section("_ftrace_events")
static const char print_fmt_##call[] = print; \
\
static struct ftrace_event_call __used \
-__attribute__((__aligned__(4))) \
+__long_aligned \
__attribute__((section("_ftrace_events"))) event_##call = { \
.name = #call, \
.class = &event_class_##template, \
Index: linux-2.6-lttng/kernel/trace/trace.h
===================================================================
--- linux-2.6-lttng.orig/kernel/trace/trace.h
+++ linux-2.6-lttng/kernel/trace/trace.h
@@ -749,7 +749,7 @@ extern const char *__stop___trace_bprint
#undef FTRACE_ENTRY
#define FTRACE_ENTRY(call, struct_name, id, tstruct, print) \
extern struct ftrace_event_call \
- __attribute__((__aligned__(4))) event_##call;
+ __long_aligned event_##call;
#undef FTRACE_ENTRY_DUP
#define FTRACE_ENTRY_DUP(call, struct_name, id, tstruct, print) \
FTRACE_ENTRY(call, struct_name, id, PARAMS(tstruct), PARAMS(print))
Index: linux-2.6-lttng/kernel/trace/trace_export.c
===================================================================
--- linux-2.6-lttng.orig/kernel/trace/trace_export.c
+++ linux-2.6-lttng/kernel/trace/trace_export.c
@@ -156,7 +156,7 @@ struct ftrace_event_class event_class_ft
}; \
\
struct ftrace_event_call __used \
-__attribute__((__aligned__(4))) \
+__long_aligned \
__attribute__((section("_ftrace_events"))) event_##call = { \
.name = #call, \
.event.type = etype, \
Index: linux-2.6-lttng/include/linux/ftrace_event.h
===================================================================
--- linux-2.6-lttng.orig/include/linux/ftrace_event.h
+++ linux-2.6-lttng/include/linux/ftrace_event.h
@@ -194,7 +194,7 @@ struct ftrace_event_call {
int perf_refcount;
struct hlist_head __percpu *perf_events;
#endif
-};
+} __long_aligned;

#define PERF_MAX_TRACE_SIZE 2048

Index: linux-2.6-lttng/include/trace/syscall.h
===================================================================
--- linux-2.6-lttng.orig/include/trace/syscall.h
+++ linux-2.6-lttng/include/trace/syscall.h
@@ -29,7 +29,7 @@ struct syscall_metadata {

struct ftrace_event_call *enter_event;
struct ftrace_event_call *exit_event;
-};
+} __long_aligned;

#ifdef CONFIG_FTRACE_SYSCALLS
extern unsigned long arch_syscall_addr(int nr);
--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
David Miller
2011-01-19 05:16:39 UTC
Permalink
From: Mathieu Desnoyers <***@efficios.com>
Date: Wed, 19 Jan 2011 00:08:45 -0500
Post by Mathieu Desnoyers
The following works fine for me now. Comments are welcome.
Thanks for doing this work Mathieu.
Post by Mathieu Desnoyers
- No aligned() type attribute nor variable attribute. I get a crash on x86_64
(NULL pointer exception when executing __trace_add_event_call, the 5th call).
__alignof__(struct ftrace_event_call) is worth 8.
This is really bizarre. Does it only happen on x86_64?

I'm wondering if GCC does something bizarre like work with different
default alignments based upon the section or something like that.

If so, maybe adding the section attribute to the array definition will
"fix" things?
Post by Mathieu Desnoyers
On 32-bit architectures, we really want a aligned(4), and on 64-bit
#define __long_aligned __attribute__((__aligned__(__alignof__(long))))
Do any of these datastructures have, or will have, "u64" or "long
long" types in them? If so, then we will need to use "8"
unconditionally or "__alignof__(long long)".

I'll see if I can work out why using no align directive explodes
on x86-64.
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Mathieu Desnoyers
2011-01-19 15:10:04 UTC
Permalink
Post by David Miller
Date: Wed, 19 Jan 2011 00:08:45 -0500
Post by Mathieu Desnoyers
The following works fine for me now. Comments are welcome.
Thanks for doing this work Mathieu.
Post by Mathieu Desnoyers
- No aligned() type attribute nor variable attribute. I get a crash on x86_64
(NULL pointer exception when executing __trace_add_event_call, the 5th call).
__alignof__(struct ftrace_event_call) is worth 8.
This is really bizarre. Does it only happen on x86_64?
Sadly, my ppc32 test machine is currently broken, so I could not check on other
than x86 archs.
Post by David Miller
I'm wondering if GCC does something bizarre like work with different
default alignments based upon the section or something like that.
If so, maybe adding the section attribute to the array definition will
"fix" things?
Well, I thought about it in my sleep, and it looks like gcc is within its rights
to align these statically declared structures on a larger alignment: gcc has no
clue that we're going to do tricks with the linker to access the structures as
an array, so aligning on a larger alignment *should* be fine for the compiler,
but we suffer because we're doing something non-standard.
Post by David Miller
Post by Mathieu Desnoyers
On 32-bit architectures, we really want a aligned(4), and on 64-bit
#define __long_aligned __attribute__((__aligned__(__alignof__(long))))
Do any of these datastructures have, or will have, "u64" or "long
long" types in them? If so, then we will need to use "8"
unconditionally or "__alignof__(long long)".
If my memory serves me correctly, I think "long long" is aligned on 4 bytes on
ppc32, but on 8 bytes on x86_32 (yeah, that's weird). How about we create a
#define __long_long_aligned __attribute__((__aligned__(__alignof__(long long))))

?

Thanks,

Mathieu
Post by David Miller
I'll see if I can work out why using no align directive explodes
on x86-64.
--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Sam Ravnborg
2011-01-19 16:14:55 UTC
Permalink
Post by Mathieu Desnoyers
If my memory serves me correctly, I think "long long" is aligned on 4 bytes on
ppc32, but on 8 bytes on x86_32 (yeah, that's weird). How about we create a
#define __long_long_aligned __attribute__((__aligned__(__alignof__(long long))))
#define __u64_aligned __attribute__((__aligned__(__alignof__(long long))))

A bit shorter but maybe less obvious.

Sam
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Mathieu Desnoyers
2011-01-19 16:18:57 UTC
Permalink
Post by Sam Ravnborg
Post by Mathieu Desnoyers
If my memory serves me correctly, I think "long long" is aligned on 4 bytes on
ppc32, but on 8 bytes on x86_32 (yeah, that's weird). How about we create a
#define __long_long_aligned __attribute__((__aligned__(__alignof__(long long))))
#define __u64_aligned __attribute__((__aligned__(__alignof__(long long))))
A bit shorter but maybe less obvious.
Yep, that would make sense.

I'm tempted to try creating

#defined __u64_packed_aligned __attribute__((__packed__, __aligned__(__alignof__(long long))))

in the hope that gcc sees this as a strict alignment requirement (including a
max bound) rather than just a hint. From what I gather in my reading of

http://gcc.gnu.org/onlinedocs/gcc/Type-Attributes.html

"The aligned attribute can only increase the alignment; but you can decrease it
by specifying packed as well. See below."

gcc seems to support having both specified. I think this would provide the kind
of alignment guarantees we really need here: both specifying the minimum _and_
maximum alignment.

Thoughts ?

Mathieu
Post by Sam Ravnborg
Sam
--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
David Miller
2011-01-19 06:32:47 UTC
Permalink
From: Mathieu Desnoyers <***@efficios.com>
Date: Wed, 19 Jan 2011 00:08:45 -0500
Post by Mathieu Desnoyers
- No aligned() type attribute nor variable attribute. I get a crash on x86_64
(NULL pointer exception when executing __trace_add_event_call, the 5th call).
__alignof__(struct ftrace_event_call) is worth 8.
I think I figured it out.

It's the static vs. non-static thing, or some other crazyness wrt.
how x86-64 implements it's alignment rules.

GCC on x86-64 uses a completely different policy for aligning local
(ie. "static") data objects vs. globally visible ones, for one thing.
It also has different alignment policies for objects that are part
of an array vs. those which are not.

On both counts, we're lying to the compiler, so maybe it's sort of our
fault.

As far as GCC can see, the object is static and also not part of an
array or any other C construct for which things like this could matter
as long as the alignment it chooses meets the minimum alignment
requirements of the ABI.

So it doesn't let us do this trick where we put the individual event
markers into a special section, yet mark it __used and static, then
later access them as if they were part of a globally visible array.

If you look these static objects, they are emitted with a leading
".align 32" directive. This is what screws everything up.

When the linker sees this, it aligns the start of every individual
"_ftrace_events" section, and that's where the "gaps" come from and
the crashes.

--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
David Miller
2011-01-19 07:20:45 UTC
Permalink
From: David Miller <***@davemloft.net>
Date: Tue, 18 Jan 2011 22:32:47 -0800 (PST)
Post by David Miller
As far as GCC can see, the object is static and also not part of an
array or any other C construct for which things like this could matter
as long as the alignment it chooses meets the minimum alignment
requirements of the ABI.
So it doesn't let us do this trick where we put the individual event
markers into a special section, yet mark it __used and static, then
later access them as if they were part of a globally visible array.
If you look these static objects, they are emitted with a leading
".align 32" directive. This is what screws everything up.
When the linker sees this, it aligns the start of every individual
"_ftrace_events" section, and that's where the "gaps" come from and
the crashes.
I've come up with one possible way to deal with this.

Put pointers to the trace objects into the special section, and
interate over those instead.

I was wondering why this x86-64 weird alignment behavior doesn't bite
us for our init funcs. And the reason is that all of these weird
alignment cases only trigger for aggregates (ie. structs and unions).

So we could do:

static struct ftace_event_call foo_event = { ... };
static struct ftrace_event_call * __used
__attribute__((section("_ftrace_event_ptrs")))
foo_event_ptr = &foo_event;

and

extern struct ftrace_event_call *__start_ftrace_event_ptrs[];
extern struct ftrace_event_call *__end_ftrace_event_ptrs[];

struct ftrace_event_call **p = __start_ftrace_event_ptrs;
while (p < &__end_ftrace_event_ptrs[0]) {
struct ftrace_event_call *event = *p++;

__trace_add_event_call(event, ...);
}

you get the idea.

And we could mark this entire point section as "initdata" and thus
free'able after bootup and post module load.
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Mathieu Desnoyers
2011-01-19 15:33:26 UTC
Permalink
Post by David Miller
Date: Tue, 18 Jan 2011 22:32:47 -0800 (PST)
Post by David Miller
As far as GCC can see, the object is static and also not part of an
array or any other C construct for which things like this could matter
as long as the alignment it chooses meets the minimum alignment
requirements of the ABI.
So it doesn't let us do this trick where we put the individual event
markers into a special section, yet mark it __used and static, then
later access them as if they were part of a globally visible array.
If you look these static objects, they are emitted with a leading
".align 32" directive. This is what screws everything up.
When the linker sees this, it aligns the start of every individual
"_ftrace_events" section, and that's where the "gaps" come from and
the crashes.
I've come up with one possible way to deal with this.
Put pointers to the trace objects into the special section, and
interate over those instead.
I was wondering why this x86-64 weird alignment behavior doesn't bite
us for our init funcs. And the reason is that all of these weird
alignment cases only trigger for aggregates (ie. structs and unions).
static struct ftace_event_call foo_event = { ... };
static struct ftrace_event_call * __used
__attribute__((section("_ftrace_event_ptrs")))
foo_event_ptr = &foo_event;
and
extern struct ftrace_event_call *__start_ftrace_event_ptrs[];
extern struct ftrace_event_call *__end_ftrace_event_ptrs[];
struct ftrace_event_call **p = __start_ftrace_event_ptrs;
while (p < &__end_ftrace_event_ptrs[0]) {
struct ftrace_event_call *event = *p++;
__trace_add_event_call(event, ...);
}
you get the idea.
And we could mark this entire point section as "initdata" and thus
free'able after bootup and post module load.
There are three downsides to this approach compared to forcing both the type and
variable alignments with attributes:

1) It consumes extra space: This approach lets gcc align foo_event on 32-byte
boundaries, which is unneeded in this case. For structures repeated very
often, this is a bad thing.

2) It mixes .data and struct ftrace_event_call definitions, thus polluting .data
cache-lines (actually, the 32-byte alignment will leave some of these
cachelines partly unused). This would be fixable by specifying yet another
section name to hold the struct ftace_event_call definitions.

3) Freeing _ftrace_event_ptrs is only possible after init here because Ftrace
data layout is sub-optimal. The linked-list created at init time by Ftrace
kind of duplicates the arrays already implicit to the section. If we look at
Tracepoints for example, which present the exact same section alignment
problem, we iterate on the tracepoint section each time a tracepoint is
activated or deactivated. So we need to keep the section there after init.
Therefore, your indirection approach would grow the data footprint.
The trade-off here is that walking the table is a very rare operation that
does not need to be fast, so we accept the performance hit of walking the
tracepoint table for each enabled tracepoint to shrink the memory footprint.

Especially for reasons (1) and (2), I'd be tempted to go for the
__long_long_aligned type and variable attribute instead. Thoughts ?

I'm still unsure that __long_long_aligned is needed over __long_aligned though.
AFAIK, the only requirement we have for, e.g. tracepoints, is to align on the
pointer size (sizeof(long)), so RCU pointer updates are performed atomically.
Aligning on the pointer size also allows the architecture to efficiently read
the field content. What does aligning on sizeof(long long) bring to us ? Is it
that you are concerned about the fact that the "aligned" type attribute, when
applied to a structure, is only used as a lower-bound by the compiler ? In that
case, we might want to consider using "packed" too:

#define __long_packed_aligned __attribute__((__packed__, __aligned__(__alignof__(long))))

I would just like to know if this attribute causes gcc to emit slower memory
access instructions on ppc/sparc/mips (I remember that at least one of these
emit slower instructions for unaligned read/writes, so I wonder if the compiler
uses them as soon as it sees the "packed" attribute, or if it is more clever
than that).

Thanks,

Mathieu
--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
David Miller
2011-01-19 21:40:47 UTC
Permalink
From: Mathieu Desnoyers <***@efficios.com>
Date: Wed, 19 Jan 2011 10:33:26 -0500
Post by Mathieu Desnoyers
I'm still unsure that __long_long_aligned is needed over __long_aligned though.
AFAIK, the only requirement we have for, e.g. tracepoints, is to align on the
pointer size (sizeof(long)), so RCU pointer updates are performed atomically.
Aligning on the pointer size also allows the architecture to efficiently read
the field content. What does aligning on sizeof(long long) bring to us ? Is it
that you are concerned about the fact that the "aligned" type attribute, when
applied to a structure, is only used as a lower-bound by the compiler ? In that
My concern is that if there is ever a u64 or similarly "long long"
typed member in these tracing structures, it will not be aligned
sufficiently to avoid unaligned access traps on 32-bit systems.

If your suggestion defines the lowest possible alignment and GCC will
do the right thing and "up-align" the structure if necessary, then
fine.

If you add "packed" it is going to screw everything up and we'll
essentially be back to square one.

On RISC like sparc64, "packed" causes even 16-bit words to be read and
written a byte at a time.

Never use "packed" under any circumstances unless absolutely
unavoidable.
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Steven Rostedt
2011-01-19 22:00:23 UTC
Permalink
Post by David Miller
My concern is that if there is ever a u64 or similarly "long long"
typed member in these tracing structures, it will not be aligned
sufficiently to avoid unaligned access traps on 32-bit systems.
The structure that gets placed in this section is the ftrace_event_call.
It consists only of pointers, a struct list_head, a couple of "int", and
a struct trace_event, which consists of: a struct hlist_node, a struct
list_head, an int, and a pointer.

None of these are more than "long" and I don't foresee them needing a
long long type. I think assuming that a long is the largest item should
due.

We can add a comment next to these structures specifying this
dependency, and hopefully it would be updated if we ever do include a
long long in them.

-- Steve


--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
David Miller
2011-01-19 22:09:17 UTC
Permalink
From: Steven Rostedt <***@goodmis.org>
Date: Wed, 19 Jan 2011 17:00:23 -0500
Post by Steven Rostedt
We can add a comment next to these structures specifying this
dependency, and hopefully it would be updated if we ever do include a
long long in them.
Yes, I think a huge comment should be placed somewhere and also the
commit message for the final version of this fix should be quite
verbose :-)
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Mathieu Desnoyers
2011-01-19 22:21:44 UTC
Permalink
Post by Steven Rostedt
Post by David Miller
My concern is that if there is ever a u64 or similarly "long long"
typed member in these tracing structures, it will not be aligned
sufficiently to avoid unaligned access traps on 32-bit systems.
The structure that gets placed in this section is the ftrace_event_call.
It consists only of pointers, a struct list_head, a couple of "int", and
a struct trace_event, which consists of: a struct hlist_node, a struct
list_head, an int, and a pointer.
None of these are more than "long" and I don't foresee them needing a
long long type. I think assuming that a long is the largest item should
due.
We can add a comment next to these structures specifying this
dependency, and hopefully it would be updated if we ever do include a
long long in them.
I still wonder how a 32-bit system can generate an unaligned access trap for an
access to a 64-bit variable aligned on 32-bit, given that there is, by
definition, no 64-bit memory accesses available on the architecture ?

Mathieu
--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
David Miller
2011-01-19 22:23:48 UTC
Permalink
From: Mathieu Desnoyers <***@efficios.com>
Date: Wed, 19 Jan 2011 17:21:44 -0500
Post by Mathieu Desnoyers
I still wonder how a 32-bit system can generate an unaligned access trap for an
access to a 64-bit variable aligned on 32-bit, given that there is, by
definition, no 64-bit memory accesses available on the architecture ?
Sparc 32-bit (and I believe MIPS 32-bit as well) have such 64-bit
load and store instructions.
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Sam Ravnborg
2011-01-19 22:32:34 UTC
Permalink
Post by Mathieu Desnoyers
I still wonder how a 32-bit system can generate an unaligned access trap for an
access to a 64-bit variable aligned on 32-bit, given that there is, by
definition, no 64-bit memory accesses available on the architecture ?
Load/Store Instructions

...
Integer load and store instructions support byte (8-bit), halfword (16-bit), word
(32-bit), and doubleword (64-bit) accesses.
...

Alignment Restrictions

Halfword accesses must be aligned on a 2-byte boundary, word accesses (which
include instruction fetches) must be aligned on a 4-byte boundary, and doubleword
accesses must be aligned on an 8-byte boundary. An improperly aligned
address causes a load or store instruction to generate a mem_address_not_aligned
trap.


Sam
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Mathieu Desnoyers
2011-01-19 22:34:40 UTC
Permalink
Post by Sam Ravnborg
Post by Mathieu Desnoyers
I still wonder how a 32-bit system can generate an unaligned access trap for an
access to a 64-bit variable aligned on 32-bit, given that there is, by
definition, no 64-bit memory accesses available on the architecture ?
Load/Store Instructions
...
Integer load and store instructions support byte (8-bit), halfword (16-bit), word
(32-bit), and doubleword (64-bit) accesses.
...
Alignment Restrictions
Halfword accesses must be aligned on a 2-byte boundary, word accesses (which
include instruction fetches) must be aligned on a 4-byte boundary, and doubleword
accesses must be aligned on an 8-byte boundary. An improperly aligned
address causes a load or store instruction to generate a mem_address_not_aligned
trap.
Ah! There is always an odd case ;) Thanks!

Mathieu
--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Mathieu Desnoyers
2011-01-19 22:13:27 UTC
Permalink
Post by David Miller
Date: Wed, 19 Jan 2011 10:33:26 -0500
Post by Mathieu Desnoyers
I'm still unsure that __long_long_aligned is needed over __long_aligned though.
AFAIK, the only requirement we have for, e.g. tracepoints, is to align on the
pointer size (sizeof(long)), so RCU pointer updates are performed atomically.
Aligning on the pointer size also allows the architecture to efficiently read
the field content. What does aligning on sizeof(long long) bring to us ? Is it
that you are concerned about the fact that the "aligned" type attribute, when
applied to a structure, is only used as a lower-bound by the compiler ? In that
My concern is that if there is ever a u64 or similarly "long long"
typed member in these tracing structures, it will not be aligned
sufficiently to avoid unaligned access traps on 32-bit systems.
Hrm, I'd like to see what kind of ill-conceived 32-bit architecture would
generate a unaligned access for a 32-bit aligned u64. Do you have examples in
mind ? By definition, the memory accesses should be at most 32-bit, no ? AFAIK,
gcc treats u64 as two distinct reads on all 32-bit architectures.
Post by David Miller
If your suggestion defines the lowest possible alignment and GCC will
do the right thing and "up-align" the structure if necessary, then
fine.
Well, I must admit that my assumption is that aligning on the "long" size should
be the only alignment required, both on 32-bit and 64-bit. But I'm curious to
see if there are indeed architectures that break this assumption.

Ideally, I'd like to avoid letting gcc up-align a structure, because it is then
hard to know for sure what the alignment value of the section should be (in the
linker script, we can safely choose 32, but it's more a "safe choice" than
anything else). Moreover, I'm not convinced that gcc will choose to up-align the
structure with the exact same alignment values for both the type declaration and
the variable definition (I'm deeply distrusting gcc to do the right thing here).
Post by David Miller
If you add "packed" it is going to screw everything up and we'll
essentially be back to square one.
On RISC like sparc64, "packed" causes even 16-bit words to be read and
written a byte at a time.
Never use "packed" under any circumstances unless absolutely
unavoidable.
gcc on my sparc64 box (32-bit userland) disagrees with you here ;) Using
gcc (Debian 4.3.3-14) 4.3.3, here is a demonstration that, indeed, "packed"
generates aweful code, but that "packed, aligned(4 or 8)" generates pretty
decent code:

compiling for sparc32:

struct test {
unsigned long a;
unsigned long b;
};

Storing to test "a" field in a main() that returns 0, with -O0:

000104f0 <main>:
104f0: 9d e3 bf 90 save %sp, -112, %sp
104f4: 03 00 00 81 sethi %hi(0x20400), %g1
104f8: 84 10 63 9c or %g1, 0x39c, %g2 ! 2079c <blah>
104fc: 82 10 20 2a mov 0x2a, %g1
10500: c2 20 80 00 st %g1, [ %g2 ]
10504: 82 10 20 00 clr %g1
10508: b0 10 00 01 mov %g1, %i0
1050c: 81 e8 00 00 restore
10510: 81 c3 e0 08 retl
10514: 01 00 00 00 nop

__attribute__((packed))

000104f0 <main>:
104f0: 9d e3 bf 90 save %sp, -112, %sp
104f4: 03 00 00 81 sethi %hi(0x20400), %g1
104f8: 84 10 63 dc or %g1, 0x3dc, %g2 ! 207dc <blah>
104fc: c2 08 80 00 ldub [ %g2 ], %g1
10500: 82 08 60 00 and %g1, 0, %g1
10504: c2 28 80 00 stb %g1, [ %g2 ]
10508: c2 08 a0 01 ldub [ %g2 + 1 ], %g1
1050c: 82 08 60 00 and %g1, 0, %g1
10510: c2 28 a0 01 stb %g1, [ %g2 + 1 ]
10514: c2 08 a0 02 ldub [ %g2 + 2 ], %g1
10518: 82 08 60 00 and %g1, 0, %g1
1051c: c2 28 a0 02 stb %g1, [ %g2 + 2 ]
10520: c2 08 a0 03 ldub [ %g2 + 3 ], %g1
10524: 82 08 60 00 and %g1, 0, %g1
10528: 82 10 60 2a or %g1, 0x2a, %g1
1052c: c2 28 a0 03 stb %g1, [ %g2 + 3 ]
10530: 82 10 20 00 clr %g1
10534: b0 10 00 01 mov %g1, %i0
10538: 81 e8 00 00 restore
1053c: 81 c3 e0 08 retl
10540: 01 00 00 00 nop

__attribute__((packed, aligned(4)))

000104f0 <main>:
104f0: 9d e3 bf 90 save %sp, -112, %sp
104f4: 03 00 00 81 sethi %hi(0x20400), %g1
104f8: 84 10 63 9c or %g1, 0x39c, %g2 ! 2079c <blah>
104fc: 82 10 20 2a mov 0x2a, %g1
10500: c2 20 80 00 st %g1, [ %g2 ]
10504: 82 10 20 00 clr %g1
10508: b0 10 00 01 mov %g1, %i0
1050c: 81 e8 00 00 restore
10510: 81 c3 e0 08 retl
10514: 01 00 00 00 nop

__attribute__((packed, aligned(8)))

000104f0 <main>:
104f0: 9d e3 bf 90 save %sp, -112, %sp
104f4: 03 00 00 81 sethi %hi(0x20400), %g1
104f8: 84 10 63 a0 or %g1, 0x3a0, %g2 ! 207a0 <blah>
104fc: 82 10 20 2a mov 0x2a, %g1
10500: c2 20 80 00 st %g1, [ %g2 ]
10504: 82 10 20 00 clr %g1
10508: b0 10 00 01 mov %g1, %i0
1050c: 81 e8 00 00 restore
10510: 81 c3 e0 08 retl
10514: 01 00 00 00 nop

Now about :

struct test {
unsigned long long a;
unsigned long long b;
};

__attribute__((packed, aligned(8)))
(and without attribute)

000104f0 <main>:
104f0: 9d e3 bf 90 save %sp, -112, %sp
104f4: 03 00 00 81 sethi %hi(0x20400), %g1
104f8: 82 10 63 a0 or %g1, 0x3a0, %g1 ! 207a0 <blah>
104fc: 84 10 20 00 clr %g2
10500: 86 10 20 2a mov 0x2a, %g3
10504: c4 38 40 00 std %g2, [ %g1 ]
10508: 82 10 20 00 clr %g1
1050c: b0 10 00 01 mov %g1, %i0
10510: 81 e8 00 00 restore
10514: 81 c3 e0 08 retl
10518: 01 00 00 00 nop
1051c: 00 00 00 00 illtrap 0

__attribute__((packed, aligned(4)))

000104f0 <main>:
104f0: 9d e3 bf 90 save %sp, -112, %sp
104f4: 03 00 00 81 sethi %hi(0x20400), %g1
104f8: 84 10 63 9c or %g1, 0x39c, %g2 ! 2079c <blah>
104fc: 82 10 20 2a mov 0x2a, %g1
10500: c2 20 a0 04 st %g1, [ %g2 + 4 ]
10504: c0 20 80 00 clr [ %g2 ]
10508: 82 10 20 00 clr %g1
1050c: b0 10 00 01 mov %g1, %i0
10510: 81 e8 00 00 restore
10514: 81 c3 e0 08 retl
10518: 01 00 00 00 nop
1051c: 00 00 00 00 illtrap 0

So the packed, aligned(__alignof__(long)) options does not look that bad.

Mathieu
--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
David Miller
2011-01-19 22:21:37 UTC
Permalink
From: Mathieu Desnoyers <***@efficios.com>
Date: Wed, 19 Jan 2011 17:13:27 -0500
Post by Mathieu Desnoyers
Hrm, I'd like to see what kind of ill-conceived 32-bit architecture would
generate a unaligned access for a 32-bit aligned u64. Do you have examples in
mind ? By definition, the memory accesses should be at most 32-bit, no ? AFAIK,
gcc treats u64 as two distinct reads on all 32-bit architectures.
Sparc 32-bit has 64-bit loads and stores, GCC uses them because the ABI
specifies that every structure is at least 8 byte aligned.
Post by Mathieu Desnoyers
gcc on my sparc64 box (32-bit userland) disagrees with you here ;) Using
gcc (Debian 4.3.3-14) 4.3.3, here is a demonstration that, indeed, "packed"
generates aweful code, but that "packed, aligned(4 or 8)" generates pretty
Amazing, if this works then do it.

But please document this fully with comments and such :-)
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Mathieu Desnoyers
2011-01-19 22:33:39 UTC
Permalink
Post by David Miller
Date: Wed, 19 Jan 2011 17:13:27 -0500
Post by Mathieu Desnoyers
Hrm, I'd like to see what kind of ill-conceived 32-bit architecture would
generate a unaligned access for a 32-bit aligned u64. Do you have examples in
mind ? By definition, the memory accesses should be at most 32-bit, no ? AFAIK,
gcc treats u64 as two distinct reads on all 32-bit architectures.
Sparc 32-bit has 64-bit loads and stores, GCC uses them because the ABI
specifies that every structure is at least 8 byte aligned.
Ah, that's the answer I was looking for, thanks!
Post by David Miller
Post by Mathieu Desnoyers
gcc on my sparc64 box (32-bit userland) disagrees with you here ;) Using
gcc (Debian 4.3.3-14) 4.3.3, here is a demonstration that, indeed, "packed"
generates aweful code, but that "packed, aligned(4 or 8)" generates pretty
Amazing, if this works then do it.
But please document this fully with comments and such :-)
I will, I will! ;)

So I guess we go for the following. Is it verbose enough ?

/*
* __u64_packed_aligned:
*
* Forces gcc to use the u64 type alignment, up-aligning or down-aligning the
* target type if necessary. The memory accesses to the target structure are
* efficient (does not require bytewise memory accesses) and the atomic pointer
* update guarantees required by RCU are kept. u64 is considered as the largest
* type that can generate a trap for unaligned accesses (u64 on sparc32 needs to
* be aligned on 64-bit).
*
* Specifying both "packed" and "aligned" generates decent code (without the
* bytewise memory accesses generated by simply using "packed"), and forces
* gcc to down-align the structure alignment to the alignment of a u64 type.
*
* This alignment should be used for both structure definitions and declarations
* (as *both* the type and variable attribute) when using the "section"
* attribute to generate arrays of structures.
*/
#define __u64_packed_aligned \
__attribute__((__packed__, __aligned__(__alignof__(long long))))

Thanks,

Mathieu
--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
David Miller
2011-01-20 00:41:07 UTC
Permalink
From: Mathieu Desnoyers <***@efficios.com>
Date: Wed, 19 Jan 2011 17:33:39 -0500
Post by Mathieu Desnoyers
So I guess we go for the following. Is it verbose enough ?
It's got all of the details that seem to matter, thanks.
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Mathieu Desnoyers
2011-01-21 00:04:21 UTC
Permalink
Post by David Miller
Date: Wed, 19 Jan 2011 17:33:39 -0500
Post by Mathieu Desnoyers
So I guess we go for the following. Is it verbose enough ?
It's got all of the details that seem to matter, thanks.
I'm letting people following this thread and the bug report know that I posted
the patch we discussed about here in the following thread:

https://lkml.org/lkml/2011/1/19/509

Feedback is welcome,

Mathieu
--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
Richard Mortimer
2011-01-21 18:06:04 UTC
Permalink
Post by Mathieu Desnoyers
Post by David Miller
Date: Wed, 19 Jan 2011 17:33:39 -0500
Post by Mathieu Desnoyers
So I guess we go for the following. Is it verbose enough ?
It's got all of the details that seem to matter, thanks.
I'm letting people following this thread and the bug report know that I posted
https://lkml.org/lkml/2011/1/19/509
Feedback is welcome,
Hi Mathieu,

Thanks for this.

The good news is that I have a 2.6.37 kernel + your patches booting and
it has been running for about an hour on sparc64/Sun Fire V120. I did a
bit of poking around in /debug and turned on/off a bit of tracing and
nothing broke.

I did have a few problems applying your patch series. Your diffs have
semicolons at the end of a couple of the macros that don't appear in the
mainline 2.6.37. One was circa line 174 of include/linux/tracepoint.h
but I don't have the other to hand at the moment and I don't have much
time before I need to go out.

I'm also getting a lot of Kernel unaligned access errors from the
kernel. I don't know if they are related to this or not and this is the
first time that I personally have got 2.6.37 to boot on sparc64. The
messages that I am getting seem to be repeats of

[ 4376.807811] Kernel unaligned access at TPC[456e94]
try_to_wake_up+0x58/0xec
[ 4376.807908] Kernel unaligned access at TPC[75541c] schedule+0x454/0x660
[ 4376.808044] Kernel unaligned access at TPC[75541c] schedule+0x454/0x660
[ 4376.808871] Kernel unaligned access at TPC[456e94]
try_to_wake_up+0x58/0xec
[ 4376.808965] Kernel unaligned access at TPC[75541c] schedule+0x454/0x660
[ 4381.813354] log_unaligned: 337 callbacks suppressed

I have to go out now but will be around later/over the weekend.

Regards

Richard
Mathieu Desnoyers
2011-01-21 18:52:59 UTC
Permalink
Post by Richard Mortimer
Post by Mathieu Desnoyers
Post by David Miller
Date: Wed, 19 Jan 2011 17:33:39 -0500
Post by Mathieu Desnoyers
So I guess we go for the following. Is it verbose enough ?
It's got all of the details that seem to matter, thanks.
I'm letting people following this thread and the bug report know that I posted
https://lkml.org/lkml/2011/1/19/509
Feedback is welcome,
Hi Richard,
Post by Richard Mortimer
Hi Mathieu,
Thanks for this.
The good news is that I have a 2.6.37 kernel + your patches booting and
it has been running for about an hour on sparc64/Sun Fire V120. I did a
bit of poking around in /debug and turned on/off a bit of tracing and
nothing broke.
That's a good start :)
Post by Richard Mortimer
I did have a few problems applying your patch series. Your diffs have
semicolons at the end of a couple of the macros that don't appear in the
mainline 2.6.37. One was circa line 174 of include/linux/tracepoint.h
but I don't have the other to hand at the moment and I don't have much
time before I need to go out.
I know what's causing this, I'll rebase my patch before my next post (I was
working at the tail of my lttng patch queue). Thanks for letting me know.
Post by Richard Mortimer
I'm also getting a lot of Kernel unaligned access errors from the
kernel. I don't know if they are related to this or not and this is the
first time that I personally have got 2.6.37 to boot on sparc64. The
messages that I am getting seem to be repeats of
[ 4376.807811] Kernel unaligned access at TPC[456e94]
try_to_wake_up+0x58/0xec
[ 4376.807908] Kernel unaligned access at TPC[75541c] schedule+0x454/0x660
[ 4376.808044] Kernel unaligned access at TPC[75541c] schedule+0x454/0x660
[ 4376.808871] Kernel unaligned access at TPC[456e94]
try_to_wake_up+0x58/0xec
[ 4376.808965] Kernel unaligned access at TPC[75541c] schedule+0x454/0x660
[ 4381.813354] log_unaligned: 337 callbacks suppressed
I have to go out now but will be around later/over the weekend.
Can you send me your .config ? I'd really like to see which of tracepoint/static
jump patching are enabled.

Do you get this message as soon as the system boots, or only when you enable
tracing ?

Thanks,

Mathieu
Post by Richard Mortimer
Regards
Richard
--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Mathieu Desnoyers
2011-01-21 19:15:56 UTC
Permalink
[...]
Post by Richard Mortimer
I'm also getting a lot of Kernel unaligned access errors from the
kernel. I don't know if they are related to this or not and this is the
first time that I personally have got 2.6.37 to boot on sparc64. The
messages that I am getting seem to be repeats of
[ 4376.807811] Kernel unaligned access at TPC[456e94]
try_to_wake_up+0x58/0xec
[ 4376.807908] Kernel unaligned access at TPC[75541c] schedule+0x454/0x660
[ 4376.808044] Kernel unaligned access at TPC[75541c] schedule+0x454/0x660
[ 4376.808871] Kernel unaligned access at TPC[456e94]
try_to_wake_up+0x58/0xec
[ 4376.808965] Kernel unaligned access at TPC[75541c] schedule+0x454/0x660
[ 4381.813354] log_unaligned: 337 callbacks suppressed
I have to go out now but will be around later/over the weekend.
OK, I pinpointed it to my use of the "packed" attribute. So within the
structure:

struct test {
const char *a;
int b;
void *c;
void *d;
void *e;
} __attribute__((packed, aligned(8)));

(on sparc64)

It provides the following offsets:

0 8 12 20 28

which is clearly wrong in terms of inner alignment: it removes the padding
between b and c. I am really tempted to just remove the "packed" attribute from
there: our goal is really to make sure the 8-byte accesses are all aligned after
all. So theoretically gcc could decide to align all struct test arrays and
pointers on an alignment larger than 8 if we just specify the aligned(8) type
attribute (because the type attribute is just a minimum floor alignment value),
but the only reason I would see for gcc to align these on larger alignment would
be that the structures would contain a field that requires such largish
alignment (which I doubt we have in the kernel).

I'll prepare updated patches shortly.

Thanks,

Mathieu
--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Richard Mortimer
2011-01-21 20:14:11 UTC
Permalink
...
Post by Mathieu Desnoyers
Post by Richard Mortimer
I'm also getting a lot of Kernel unaligned access errors from the
kernel. I don't know if they are related to this or not and this is the
first time that I personally have got 2.6.37 to boot on sparc64. The
messages that I am getting seem to be repeats of
[ 4376.807811] Kernel unaligned access at TPC[456e94]
try_to_wake_up+0x58/0xec
[ 4376.807908] Kernel unaligned access at TPC[75541c] schedule+0x454/0x660
[ 4376.808044] Kernel unaligned access at TPC[75541c] schedule+0x454/0x660
[ 4376.808871] Kernel unaligned access at TPC[456e94]
try_to_wake_up+0x58/0xec
[ 4376.808965] Kernel unaligned access at TPC[75541c] schedule+0x454/0x660
[ 4381.813354] log_unaligned: 337 callbacks suppressed
I have to go out now but will be around later/over the weekend.
Can you send me your .config ? I'd really like to see which of tracepoint/static
jump patching are enabled.
It is basically a clone of the Debian 2.6.37 experimental kernel
.config. It has been through a couple of make oldconfig cycles with
different kernel versions so it should be pretty similar to what the
Debian folks intend. Anyhow if it is still useful you can download it from

http://bridge.oldelvet.org.uk/download/110121_linux-2.6.37_sparc64.config
Post by Mathieu Desnoyers
Do you get this message as soon as the system boots, or only when you enable
tracing ?
Good question. Looking back through the logs it started about 14 minutes
after I booted. I don't remember how that relates to me starting to play
around with tracing but I'm guessing that you have a good idea what is
happening.

I've managed to grab disassembly of the two locations that are reporting
problems just in case that helps.

0000000000456e3c <try_to_wake_up>:
456e3c: 9d e3 bf 50 save %sp, -176, %sp
456e40: a5 52 00 00 rdpr %pil, %l2
456e44: 82 14 a0 0e or %l2, 0xe, %g1
456e48: 91 90 60 00 wrpr %g1, 0, %pil
456e4c: c2 5e 00 00 ldx [ %i0 ], %g1
456e50: b2 0e 40 01 and %i1, %g1, %i1
456e54: 02 ce 40 2b brz %i1, 456f00 <try_to_wake_up+0xc4>
456e58: a2 10 20 00 clr %l1
456e5c: c2 06 20 70 ld [ %i0 + 0x70 ], %g1
456e60: 80 a0 60 00 cmp %g1, 0
456e64: 12 48 00 07 bne %icc, 456e80 <try_to_wake_up+0x44>
456e68: 03 00 22 a5 sethi %hi(0x8a9400), %g1
456e6c: 90 10 00 18 mov %i0, %o0
456e70: 7f ff ff dd call 456de4 <T.1233>
456e74: 92 10 20 01 mov 1, %o1
456e78: a2 10 20 01 mov 1, %l1
456e7c: 03 00 22 a5 sethi %hi(0x8a9400), %g1
456e80: a6 10 00 11 mov %l1, %l3
456e84: c4 00 62 80 ld [ %g1 + 0x280 ], %g2
456e88: 80 a0 a0 00 cmp %g2, 0
456e8c: 02 48 00 0f be %icc, 456ec8 <try_to_wake_up+0x8c>
456e90: a8 0c 60 ff and %l1, 0xff, %l4
456e94: e0 58 62 94 ldx [ %g1 + 0x294 ], %l0
456e98: 02 c4 00 0d brz,pn %l0, 456ecc <try_to_wake_up+0x90>
456e9c: 11 00 21 93 sethi %hi(0x864c00), %o0
456ea0: a9 3d 20 00 sra %l4, 0, %l4
456ea4: c2 5c 00 00 ldx [ %l0 ], %g1
456ea8: 92 10 00 18 mov %i0, %o1
456eac: 94 10 00 14 mov %l4, %o2
456eb0: d0 5c 20 08 ldx [ %l0 + 8 ], %o0
456eb4: 9f c0 40 00 call %g1
456eb8: a0 04 20 10 add %l0, 0x10, %l0
456ebc: c2 5c 00 00 ldx [ %l0 ], %g1


...
7553f8: 81 58 00 00 flushw
7553fc: 05 00 22 a5 sethi %hi(0x8a9400), %g2
755400: 84 10 a2 c8 or %g2, 0x2c8, %g2 ! 8a96c8
<__tracepoint_sched_switch>
755404: c2 00 a0 08 ld [ %g2 + 8 ], %g1
755408: 80 a0 60 00 cmp %g1, 0
75540c: 22 48 00 11 be,a %icc, 755450 <schedule+0x488>
755410: e4 5c 21 50 ldx [ %l0 + 0x150 ], %l2
755414: 07 00 22 a5 sethi %hi(0x8a9400), %g3
755418: 86 10 e2 e4 or %g3, 0x2e4, %g3 ! 8a96e4
<__tracepoint_sched_switch+0x1c>
75541c: e4 58 c0 00 ldx [ %g3 ], %l2
755420: 22 c4 80 0c brz,a,pn %l2, 755450 <schedule+0x488>
755424: e4 5c 21 50 ldx [ %l0 + 0x150 ], %l2
755428: c2 5c 80 00 ldx [ %l2 ], %g1

Regards

Richard

P.S. I saw your followup mail so hopefully this matches what you have found!
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Mathieu Desnoyers
2011-01-21 20:40:33 UTC
Permalink
* Richard Mortimer (***@oldelvet.org.uk) wrote:
[...]
Post by Richard Mortimer
P.S. I saw your followup mail so hopefully this matches what you have found!
Thanks for the info! At first glance, it does not seem to contradict my
findings. When you find time, can you have a try at v3 I just posted ?

Make sure to start tracing in your tests, as I think the unaligned access only
happens when accessing the struct tracepoint fields below the "int state" field.

Thanks,

Mathieu
--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
Richard Mortimer
2011-01-21 22:50:49 UTC
Permalink
Post by Mathieu Desnoyers
[...]
Post by Richard Mortimer
P.S. I saw your followup mail so hopefully this matches what you have found!
Thanks for the info! At first glance, it does not seem to contradict my
findings. When you find time, can you have a try at v3 I just posted ?
I'm setting the compilation off now. I will give it a whirl some time
Saturday or Sunday.

Before I did that I created kernel/sched.s and have verified that the
two error locations are accesses to it_func_ptr as you suspected.


ldx [%g1+%lo(__tracepoint_sched_wakeup)+28], %l0 !, it_func_ptr
and
ldx [%g3], %l2 !, it_func_ptr

with v3 of you patches applied it uses +32 to get the offset of
it_func_ptr so that looks good.
Post by Mathieu Desnoyers
Make sure to start tracing in your tests, as I think the unaligned access only
happens when accessing the struct tracepoint fields below the "int state" field.
Will do.

Regards

Richard
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Richard Mortimer
2011-01-22 18:42:33 UTC
Permalink
Post by Richard Mortimer
Post by Mathieu Desnoyers
Thanks for the info! At first glance, it does not seem to contradict my
findings. When you find time, can you have a try at v3 I just posted ?
I'm setting the compilation off now. I will give it a whirl some time
Saturday or Sunday.
...
Post by Richard Mortimer
Post by Mathieu Desnoyers
Make sure to start tracing in your tests, as I think the unaligned access only
happens when accessing the struct tracepoint fields below the "int state" field.
I have the revised patch running now with tracing enabled. Everything
looks good to me.

Thanks.

Richard
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Mathieu Desnoyers
2011-01-22 18:53:26 UTC
Permalink
Post by Richard Mortimer
Post by Richard Mortimer
Post by Mathieu Desnoyers
Thanks for the info! At first glance, it does not seem to contradict my
findings. When you find time, can you have a try at v3 I just posted ?
I'm setting the compilation off now. I will give it a whirl some time
Saturday or Sunday.
...
Post by Richard Mortimer
Post by Mathieu Desnoyers
Make sure to start tracing in your tests, as I think the unaligned access only
happens when accessing the struct tracepoint fields below the "int state" field.
I have the revised patch running now with tracing enabled. Everything
looks good to me.
Thanks for testing it!

Mathieu
Post by Richard Mortimer
Thanks.
Richard
--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Steven Rostedt
2011-01-19 15:46:00 UTC
Permalink
After applying David's "remove align" patch, I got it to boot on x86_64
with the following two patches. I thought just adding the "align" to the
structure declaration would work, but it still failed on the syscall for
init_module. By removing the double "declaration" of event_exit_##sname,
removed this problem.

I'll test this on x86 32bit and PPC 64. If it works there, I'll push all
of them out for 38. Should these go to 37 stable too?

-- Steve

diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index cacc27a..0e3bce6 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -142,8 +142,6 @@ extern struct trace_event_functions exit_syscall_print_funcs;
#define SYSCALL_TRACE_EXIT_EVENT(sname) \
static struct syscall_metadata \
__attribute__((__aligned__(4))) __syscall_meta_##sname; \
- static struct ftrace_event_call \
- __attribute__((__aligned__(4))) event_exit_##sname; \
static struct ftrace_event_call __used \
__attribute__((__aligned__(4))) \
__attribute__((section("_ftrace_events"))) \


Index: linux-trace.git/include/linux/ftrace_event.h
===================================================================
--- linux-trace.git.orig/include/linux/ftrace_event.h
+++ linux-trace.git/include/linux/ftrace_event.h
@@ -194,7 +194,7 @@ struct ftrace_event_call {
int perf_refcount;
struct hlist_head __percpu *perf_events;
#endif
-};
+} __attribute__((__aligned__(32)));

#define PERF_MAX_TRACE_SIZE 2048



--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Mathieu Desnoyers
2011-01-19 16:15:34 UTC
Permalink
Post by Steven Rostedt
After applying David's "remove align" patch, I got it to boot on x86_64
with the following two patches. I thought just adding the "align" to the
structure declaration would work, but it still failed on the syscall for
init_module. By removing the double "declaration" of event_exit_##sname,
removed this problem.
I'll test this on x86 32bit and PPC 64. If it works there, I'll push all
of them out for 38. Should these go to 37 stable too?
Please hold before adding these patches into git. They don't seem to address the
underlying problem correctly. See the latest exchanges between David Miller and
myself for more info.

We need to come up with something better than "it boots" as an explanation for
the fix.

Thanks,

Mathieu
Post by Steven Rostedt
-- Steve
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index cacc27a..0e3bce6 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -142,8 +142,6 @@ extern struct trace_event_functions exit_syscall_print_funcs;
#define SYSCALL_TRACE_EXIT_EVENT(sname) \
static struct syscall_metadata \
__attribute__((__aligned__(4))) __syscall_meta_##sname; \
- static struct ftrace_event_call \
- __attribute__((__aligned__(4))) event_exit_##sname; \
static struct ftrace_event_call __used \
__attribute__((__aligned__(4))) \
__attribute__((section("_ftrace_events"))) \
Index: linux-trace.git/include/linux/ftrace_event.h
===================================================================
--- linux-trace.git.orig/include/linux/ftrace_event.h
+++ linux-trace.git/include/linux/ftrace_event.h
@@ -194,7 +194,7 @@ struct ftrace_event_call {
int perf_refcount;
struct hlist_head __percpu *perf_events;
#endif
-};
+} __attribute__((__aligned__(32)));
#define PERF_MAX_TRACE_SIZE 2048
--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
Steven Rostedt
2011-01-19 18:13:42 UTC
Permalink
Post by Mathieu Desnoyers
Post by Steven Rostedt
After applying David's "remove align" patch, I got it to boot on x86_64
with the following two patches. I thought just adding the "align" to the
structure declaration would work, but it still failed on the syscall for
init_module. By removing the double "declaration" of event_exit_##sname,
removed this problem.
I'll test this on x86 32bit and PPC 64. If it works there, I'll push all
of them out for 38. Should these go to 37 stable too?
Please hold before adding these patches into git. They don't seem to address the
underlying problem correctly. See the latest exchanges between David Miller and
myself for more info.
We need to come up with something better than "it boots" as an explanation for
the fix.
Yes, I agree that we should solve this issue correctly. But if there is
a work around to the problem, we could implement that if the real
solution is not in our grasp yet.

-- Steve


--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Mathieu Desnoyers
2011-01-19 18:20:53 UTC
Permalink
Post by Steven Rostedt
Post by Mathieu Desnoyers
Post by Steven Rostedt
After applying David's "remove align" patch, I got it to boot on x86_64
with the following two patches. I thought just adding the "align" to the
structure declaration would work, but it still failed on the syscall for
init_module. By removing the double "declaration" of event_exit_##sname,
removed this problem.
I'll test this on x86 32bit and PPC 64. If it works there, I'll push all
of them out for 38. Should these go to 37 stable too?
Please hold before adding these patches into git. They don't seem to address the
underlying problem correctly. See the latest exchanges between David Miller and
myself for more info.
We need to come up with something better than "it boots" as an explanation for
the fix.
Yes, I agree that we should solve this issue correctly. But if there is
a work around to the problem, we could implement that if the real
solution is not in our grasp yet.
A known working workaround (used in tracepoints for a few years) is to align the
type declaration on 32 bytes. It wastes space, but works. With this solution,
you should remove all the per-variable alignment attributes.

Now what I'm discussing with David Miller is if creating a

__long_packed_aligned

and using it for *both* type and variable alignment would be more palatable (it
also works, and is more compact).

David proposed a solution with an array of pointers (extra indirection) which I
don't really like for 3 reasons I exposed in my reply to him.

So it's not that the solution is not in our grasp yet, it's more that we have to
choose the right one.

Thanks,

Mathieu
--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
David Miller
2011-01-19 21:44:28 UTC
Permalink
From: Mathieu Desnoyers <***@efficios.com>
Date: Wed, 19 Jan 2011 13:20:53 -0500
Post by Mathieu Desnoyers
Now what I'm discussing with David Miller is if creating a
__long_packed_aligned
and using it for *both* type and variable alignment would be more palatable (it
also works, and is more compact).
As I mentioned in another reply, we should not be using packed.

Packed has other implications, which makes it use byte-at-a-time accesses
for all parts of a structure when you tag it with 'packed'. GCC doesn't
try to be clever and see that actually such accesses are safe.

If plain "__long_aligned" works and, since you're tagging it to the structure
definition, it only specifies a minimum-alignment, then I'm fine with using
that to fix this.
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Mathieu Desnoyers
2011-01-19 22:15:38 UTC
Permalink
Post by David Miller
Date: Wed, 19 Jan 2011 13:20:53 -0500
Post by Mathieu Desnoyers
Now what I'm discussing with David Miller is if creating a
__long_packed_aligned
and using it for *both* type and variable alignment would be more palatable (it
also works, and is more compact).
As I mentioned in another reply, we should not be using packed.
Packed has other implications, which makes it use byte-at-a-time accesses
for all parts of a structure when you tag it with 'packed'. GCC doesn't
try to be clever and see that actually such accesses are safe.
Please see my explanation about the difference between "packed" and "packed,
aligned(4 or 8)" in the other thread. I share your concern about ugly code
generated by "packed", but my tests with a sparc32 gcc show that using both
packed and aligned attributes generate nice code.
Post by David Miller
If plain "__long_aligned" works and, since you're tagging it to the structure
definition, it only specifies a minimum-alignment, then I'm fine with using
that to fix this.
I'd prefer to add the "packed" to ensure that gcc never decides for some odd
reason to use an alignment larger than the one we specify in the linker script.

Thanks,

Mathieu
--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
David Miller
2011-01-19 22:22:04 UTC
Permalink
From: Mathieu Desnoyers <***@efficios.com>
Date: Wed, 19 Jan 2011 17:15:38 -0500
Post by Mathieu Desnoyers
Post by David Miller
If plain "__long_aligned" works and, since you're tagging it to the structure
definition, it only specifies a minimum-alignment, then I'm fine with using
that to fix this.
I'd prefer to add the "packed" to ensure that gcc never decides for some odd
reason to use an alignment larger than the one we specify in the linker script.
Agreed.
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Mathieu Desnoyers
2011-01-19 15:11:39 UTC
Permalink
Post by David Miller
Date: Wed, 19 Jan 2011 00:08:45 -0500
Post by Mathieu Desnoyers
- No aligned() type attribute nor variable attribute. I get a crash on x86_64
(NULL pointer exception when executing __trace_add_event_call, the 5th call).
__alignof__(struct ftrace_event_call) is worth 8.
I think I figured it out.
It's the static vs. non-static thing, or some other crazyness wrt.
how x86-64 implements it's alignment rules.
GCC on x86-64 uses a completely different policy for aligning local
(ie. "static") data objects vs. globally visible ones, for one thing.
It also has different alignment policies for objects that are part
of an array vs. those which are not.
On both counts, we're lying to the compiler, so maybe it's sort of our
fault.
Yep, we're in strong agreement here.
Post by David Miller
As far as GCC can see, the object is static and also not part of an
array or any other C construct for which things like this could matter
as long as the alignment it chooses meets the minimum alignment
requirements of the ABI.
So it doesn't let us do this trick where we put the individual event
markers into a special section, yet mark it __used and static, then
later access them as if they were part of a globally visible array.
If you look these static objects, they are emitted with a leading
".align 32" directive. This is what screws everything up.
Ah, yep, good way to identify the culprit.
Post by David Miller
When the linker sees this, it aligns the start of every individual
"_ftrace_events" section, and that's where the "gaps" come from and
the crashes.
OK (more to come in the following email response).

Mathieu
--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
Richard Mortimer
2011-01-19 15:27:17 UTC
Permalink
Post by David Miller
Date: Mon, 17 Jan 2011 22:00:39 -0800 (PST)
Post by David Miller
ftrace: Remove unnecessary alignment tag from ftrace_event_call.
It's completely unnecessary and causes problems on platforms
where this tag down-aligns the structure's alignment.
...
Ok, unless we can explain why these alignments are needed at all, we
--------------------
tracing: Remove unnecessary __aligned__(4) tag from trace data.
It's completely unnecessary and causes problems on platforms
where this tag down-aligns the structure's alignment.
For reference my build with the above patch completed last night. That
does boot on my sparc64 Sun Fire V120 so at the very least the
relocation issues go away.

The machine wasn't very happy and networking/nfs was a little upset but
I'll wait until the patchset is more refined before investigating
further and trying to work out if it is related or not.

Regards

Richard
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
David Miller
2011-01-17 06:07:55 UTC
Permalink
From: David Miller <***@davemloft.net>
Date: Sat, 15 Jan 2011 21:17:22 -0800 (PST)

[ Please, everyone, retain the full CC: on all replies, thanks. Some
people are replying only into the debian bug alias, and that loses
information and exposure for fixing this bug. ]
Post by David Miller
I think the problem we have here is that the _ftrace_events section is
not aligned sufficiently. That ".align 4" mnemonic is a good indication
of this. It should at least "8" on sparc64.
I did some more research.

Although I've seen commentary to the contrary, in fact using a too-small
__attribute__((aligned())) directive will lower the alignment of data
members, and yes that means it will lower the alignemnt to be below the
natural and required alignment for the given type.

So if you have, on 64-bit:

struct foo {
void *bar;
};

static struct foo test __attribute__((__aligned__(4)));

The compiler will emit "test" with 4-byte alignment into the data
section, even though 8-byte alignment is required for "test.bar"

Assuming we wanted that to actually happen, the GCC manual is very
explicit to state that in order for this to work, such down-aligned
data structures must also use the "packed" attribute.

I think we want none of this, and I think we should elide the align
directives entirely, or at least fix them so we don't get unaligned
stuff on 64-bit.

Ugh, and I just noticed that include/linux/klist.h does this fixed
alignment of "4" too, where is this stuff coming from? It's
wrong on 64-bit, at best. But I can't see the impetus behind doing
this at all in the first place.

Oh, this is some CRIS thing, because it only byte aligns. See:

commit c0e69a5bbc6fc74184aa043aadb9a53bc58f953b
Author: Jesper Nilsson <***@axis.com>
Date: Wed Jan 14 11:19:08 2009 +0100

klist.c: bit 0 in pointer can't be used as flag

That's where the klist one comes from.

The ftrace ones come from:

commit 86c38a31aa7f2dd6e74a262710bf8ebf7455acc5
Author: Jeff Mahoney <***@suse.com>
Date: Wed Feb 24 13:59:23 2010 -0500

tracing: Fix ftrace_event_call alignment for use with gcc 4.5

We really can't handle this that way, it's going to break stuff
on 64-bit systems at the very least.

How about we use __BIGGEST_ALIGNMENT__ or something arch-defined value
instead?
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Jesper Nilsson
2011-01-17 09:05:57 UTC
Permalink
Post by David Miller
Date: Sat, 15 Jan 2011 21:17:22 -0800 (PST)
[ Please, everyone, retain the full CC: on all replies, thanks. Some
people are replying only into the debian bug alias, and that loses
information and exposure for fixing this bug. ]
Post by David Miller
I think the problem we have here is that the _ftrace_events section is
not aligned sufficiently. That ".align 4" mnemonic is a good indication
of this. It should at least "8" on sparc64.
I did some more research.
Although I've seen commentary to the contrary, in fact using a too-small
__attribute__((aligned())) directive will lower the alignment of data
members, and yes that means it will lower the alignemnt to be below the
natural and required alignment for the given type.
struct foo {
void *bar;
};
static struct foo test __attribute__((__aligned__(4)));
The compiler will emit "test" with 4-byte alignment into the data
section, even though 8-byte alignment is required for "test.bar"
Assuming we wanted that to actually happen, the GCC manual is very
explicit to state that in order for this to work, such down-aligned
data structures must also use the "packed" attribute.
I think we want none of this, and I think we should elide the align
directives entirely, or at least fix them so we don't get unaligned
stuff on 64-bit.
Ugh, and I just noticed that include/linux/klist.h does this fixed
alignment of "4" too, where is this stuff coming from? It's
wrong on 64-bit, at best. But I can't see the impetus behind doing
this at all in the first place.
commit c0e69a5bbc6fc74184aa043aadb9a53bc58f953b
Date: Wed Jan 14 11:19:08 2009 +0100
klist.c: bit 0 in pointer can't be used as flag
That's where the klist one comes from.
Yup, this one could instead be solved by introducing a "flags" field
in the struct, but that was considered a too large impact fix.
Post by David Miller
commit 86c38a31aa7f2dd6e74a262710bf8ebf7455acc5
Date: Wed Feb 24 13:59:23 2010 -0500
tracing: Fix ftrace_event_call alignment for use with gcc 4.5
We really can't handle this that way, it's going to break stuff
on 64-bit systems at the very least.
How about we use __BIGGEST_ALIGNMENT__ or something arch-defined value
instead?
From CRIS-standpoint that would be fine.
/^JN - Jesper Nilsson
--
Jesper Nilsson -- ***@axis.com
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
David Miller
2011-02-01 05:11:16 UTC
Permalink
From: Jesper Nilsson <***@axis.com>
Date: Mon, 17 Jan 2011 10:05:57 +0100
Post by Jesper Nilsson
Post by David Miller
Ugh, and I just noticed that include/linux/klist.h does this fixed
alignment of "4" too, where is this stuff coming from? It's
wrong on 64-bit, at best. But I can't see the impetus behind doing
this at all in the first place.
commit c0e69a5bbc6fc74184aa043aadb9a53bc58f953b
Date: Wed Jan 14 11:19:08 2009 +0100
klist.c: bit 0 in pointer can't be used as flag
That's where the klist one comes from.
Yup, this one could instead be solved by introducing a "flags" field
in the struct, but that was considered a too large impact fix.
Jesper, could you please review this?

--------------------
klist: Fix object alignment on 64-bit.

Commit c0e69a5bbc6fc74184aa043aadb9a53bc58f953b ("klist.c: bit 0 in
pointer can't be used as flag") intended to make sure that all klist
objects were at least pointer size aligned, but used the constant "4"
which only works on 32-bit.

Use "sizeof(void *)" which is correct in all cases.

Signed-off-by: David S. Miller <***@davemloft.net>

diff --git a/include/linux/klist.h b/include/linux/klist.h
index e91a4e5..a370ce5 100644
--- a/include/linux/klist.h
+++ b/include/linux/klist.h
@@ -22,7 +22,7 @@ struct klist {
struct list_head k_list;
void (*get)(struct klist_node *);
void (*put)(struct klist_node *);
-} __attribute__ ((aligned (4)));
+} __attribute__ ((aligned (sizeof(void *))));

#define KLIST_INIT(_name, _get, _put) \
{ .k_lock = __SPIN_LOCK_UNLOCKED(_name.k_lock), \
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Jesper Nilsson
2011-02-01 10:03:05 UTC
Permalink
Post by David Miller
Jesper, could you please review this?
Looks good!
Post by David Miller
--------------------
klist: Fix object alignment on 64-bit.
Commit c0e69a5bbc6fc74184aa043aadb9a53bc58f953b ("klist.c: bit 0 in
pointer can't be used as flag") intended to make sure that all klist
objects were at least pointer size aligned, but used the constant "4"
which only works on 32-bit.
Use "sizeof(void *)" which is correct in all cases.
diff --git a/include/linux/klist.h b/include/linux/klist.h
index e91a4e5..a370ce5 100644
--- a/include/linux/klist.h
+++ b/include/linux/klist.h
@@ -22,7 +22,7 @@ struct klist {
struct list_head k_list;
void (*get)(struct klist_node *);
void (*put)(struct klist_node *);
-} __attribute__ ((aligned (4)));
+} __attribute__ ((aligned (sizeof(void *))));
#define KLIST_INIT(_name, _get, _put) \
{ .k_lock = __SPIN_LOCK_UNLOCKED(_name.k_lock), \
/^JN - Jesper Nilsson
--
Jesper Nilsson -- ***@axis.com
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Richard Mortimer
2011-01-17 10:22:41 UTC
Permalink
Post by David Miller
Date: Sat, 15 Jan 2011 21:17:22 -0800 (PST)
Post by David Miller
I think the problem we have here is that the _ftrace_events section is
not aligned sufficiently. That ".align 4" mnemonic is a good indication
of this. It should at least "8" on sparc64.
I noticed another potentially 64 bit unfriendly alignment on struct
tracepoint in include/linux/tracepoint.h. I don't think that the
alignment of 32 breaks anything but it does leave a 24 byte hole. I
don't know enough about tracing to know if that is necessary.

struct tracepoint {
const char *name; /* Tracepoint name */
int state; /* State. */
void (*regfunc)(void);
void (*unregfunc)(void);
struct tracepoint_func *funcs;
} __attribute__((aligned(32))); /*
* Aligned on 32 bytes because it is
* globally visible and gcc happily
* align these on the structure size.
* Keep in sync with vmlinux.lds.h.
*/

Note I spotted this when looking at some residual sparc64 relocation
issues when _ftrace_events alignment is changed to 8. I'll follow those
issues up in a separate email when I get time later today.

Regards

Richard


--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Steven Rostedt
2011-01-17 14:15:41 UTC
Permalink
Post by Richard Mortimer
Post by David Miller
Date: Sat, 15 Jan 2011 21:17:22 -0800 (PST)
Post by David Miller
I think the problem we have here is that the _ftrace_events section is
not aligned sufficiently. That ".align 4" mnemonic is a good indication
of this. It should at least "8" on sparc64.
I noticed another potentially 64 bit unfriendly alignment on struct
tracepoint in include/linux/tracepoint.h. I don't think that the
alignment of 32 breaks anything but it does leave a 24 byte hole. I
don't know enough about tracing to know if that is necessary.
struct tracepoint {
const char *name; /* Tracepoint name */
int state; /* State. */
void (*regfunc)(void);
void (*unregfunc)(void);
struct tracepoint_func *funcs;
} __attribute__((aligned(32))); /*
* Aligned on 32 bytes because it is
* globally visible and gcc happily
* align these on the structure size.
* Keep in sync with vmlinux.lds.h.
*/
Note I spotted this when looking at some residual sparc64 relocation
issues when _ftrace_events alignment is changed to 8. I'll follow those
issues up in a separate email when I get time later today.
Again, this is to help the linker keep arrays in tacked. Tracepoints are
allocated into the tracepoint section, and then read like an array. If
the linker adds holes as it links sections into one big one, then the
reading of the array breaks.

We either need to compact it (with the align(4)) which is undesirable,
or add our own holes like the above does.

-- Steve



--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
David Miller
2011-01-18 06:35:13 UTC
Permalink
From: Steven Rostedt <***@goodmis.org>
Date: Mon, 17 Jan 2011 09:15:41 -0500
Post by Steven Rostedt
Again, this is to help the linker keep arrays in tacked. Tracepoints are
allocated into the tracepoint section, and then read like an array. If
the linker adds holes as it links sections into one big one, then the
reading of the array breaks.
We either need to compact it (with the align(4)) which is undesirable,
or add our own holes like the above does.
Just take away all of the align directives, it should just work.

If it doesn't, then we should investigate to find out the real reason
why.

The linker has no reason to add holes, and in fact if we are to believe
the commit message for 86c38a31aa7f2dd6e74a262710bf8ebf7455acc5
("tracing: Fix ftrace_event_call alignment for use with gcc 4.5"), with
gcc-4.x where x < 5, it did work even though some ftrace_event_call
declarations lacked the align directive.

If this align thing is so critical, why don't we need it for the
exception table entries which live in the "__ex_table" section in the
kernel? It's the same _exact_ kind of thing, and the asm sequence we
use to populate it is identical to what GCC emits for the tracing
object declarations in question.
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Steven Rostedt
2011-01-18 17:30:56 UTC
Permalink
Post by David Miller
Date: Mon, 17 Jan 2011 09:15:41 -0500
Post by Steven Rostedt
Again, this is to help the linker keep arrays in tacked. Tracepoints are
allocated into the tracepoint section, and then read like an array. If
the linker adds holes as it links sections into one big one, then the
reading of the array breaks.
We either need to compact it (with the align(4)) which is undesirable,
or add our own holes like the above does.
Just take away all of the align directives, it should just work.
If that was the case, we would have never added it :-)
Post by David Miller
If it doesn't, then we should investigate to find out the real reason
why.
OK, we can remove it and I can see what breaks.
Post by David Miller
The linker has no reason to add holes, and in fact if we are to believe
the commit message for 86c38a31aa7f2dd6e74a262710bf8ebf7455acc5
("tracing: Fix ftrace_event_call alignment for use with gcc 4.5"), with
gcc-4.x where x < 5, it did work even though some ftrace_event_call
declarations lacked the align directive.
If this align thing is so critical, why don't we need it for the
exception table entries which live in the "__ex_table" section in the
kernel? It's the same _exact_ kind of thing, and the asm sequence we
use to populate it is identical to what GCC emits for the tracing
object declarations in question.
But aren't the __ex_table entries just two words? Which would align
nicely regardless.

The ftrace_event_call is a relatively large structure, as it may end on
a 4 byte alignement, the linker may start the next section with more
space to get it back to a 8 byte alignment. This is assuming that x86_64
packs the array in 4 bytes.

Hmm, perhaps changing the alignment in vmlinux.lds.h would fix that.

Anyway, I'll revert that commit and see if I can break it again. If so,
then I'll look for other solutions.

Thanks!

-- Steve


--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Richard Mortimer
2011-01-17 19:46:21 UTC
Permalink
Post by Richard Mortimer
Post by David Miller
Date: Sat, 15 Jan 2011 21:17:22 -0800 (PST)
Post by David Miller
I think the problem we have here is that the _ftrace_events section is
not aligned sufficiently. That ".align 4" mnemonic is a good indication
of this. It should at least "8" on sparc64.
I noticed another potentially 64 bit unfriendly alignment on struct
tracepoint in include/linux/tracepoint.h. I don't think that the
alignment of 32 breaks anything but it does leave a 24 byte hole. I
don't know enough about tracing to know if that is necessary.
struct tracepoint {
const char *name; /* Tracepoint name */
int state; /* State. */
void (*regfunc)(void);
void (*unregfunc)(void);
struct tracepoint_func *funcs;
} __attribute__((aligned(32))); /*
* Aligned on 32 bytes because it is
* globally visible and gcc happily
* align these on the structure size.
* Keep in sync with vmlinux.lds.h.
*/
Note I spotted this when looking at some residual sparc64 relocation
issues when _ftrace_events alignment is changed to 8. I'll follow those
issues up in a separate email when I get time later today.
I did some experiments looking at what happens when the
include/trace/ftrace.h __aligned__(4) usages for _ftrace_events are
changed to __aligned__(8). This causes the R_SPARC_UA64 relocations to
go to R_SPARC_64 and gets rid of that particular issue.

However it does not cause the R_SPARC_13 relocation records to go away.
They still persist and the ones I've looked at are be related to uses of
struct tracepoint that I mentioned earlier. I tried various different
values of alignment for both __ftrace_events and struct tracepoint but
nothing made the uses of R_SPARC_13 go away.

As an example from drivers/scsi/scsi_error.c function scsi_eh_wakeup().

This has relocation records of

0000000000002be0 R_SPARC_HI22 __tracepoint_scsi_eh_wakeup
0000000000002be4 R_SPARC_LO10 __tracepoint_scsi_eh_wakeup
0000000000002be4 R_SPARC_13 *ABS*+0x0000000000000008
0000000000002bf4 R_SPARC_LO10 __tracepoint_scsi_eh_wakeup
0000000000002bf4 R_SPARC_13 *ABS*+0x0000000000000020

for the following assembly code

sethi %hi(__tracepoint_scsi_eh_wakeup), %g1 !, tmp135
lduw [%g1+%lo(__tracepoint_scsi_eh_wakeup)+8], %g2 ! __tracepoint_scsi_eh_wakeup.state,
cmp %g2, 0 ! __tracepoint_scsi_eh_wakeup.state,
be,pt %icc, .LL454
nop !
ldx [%g1+%lo(__tracepoint_scsi_eh_wakeup)+32], %l0 !, it_func_ptr
brz,pn %l0, .LL454 ! it_func_ptr,

this corresponds to the first few lines of

void scsi_eh_wakeup(struct Scsi_Host *shost)
{
if (shost->host_busy == shost->host_failed) {
trace_scsi_eh_wakeup(shost);
wake_up_process(shost->ehandler);
SCSI_LOG_ERROR_RECOVERY(5,
printk("Waking error handler thread\n"));
}
}


If I try compiling with the -Os option removed from KBUILD_CFLAGS it
produces a more traditional R_SPARC_HI22 and R_SPARC_LO10 output as
shown below. So maybe there is a really need to add R_SPARC_13 support
to the sparc64 module loader to allow for the optimizations that the
compiler is making now.

00000000000001bc R_SPARC_HI22 __tracepoint_scsi_eh_wakeup
00000000000001c0 R_SPARC_LO10 __tracepoint_scsi_eh_wakeup
00000000000001dc R_SPARC_HI22 __tracepoint_scsi_eh_wakeup+0x0000000000000020
00000000000001e0 R_SPARC_LO10 __tracepoint_scsi_eh_wakeup+0x0000000000000020


ldx [%fp+2175], %g1 ! shost, tmp123
stx %g1, [%fp+2007] ! tmp123, shost
sethi %hi(__tracepoint_scsi_eh_wakeup), %g1 !, tmp125
or %g1, %lo(__tracepoint_scsi_eh_wakeup), %g1 ! tmp125,, tmp124
ld [%g1+8], %g1 ! __tracepoint_scsi_eh_wakeup.state, D.32824
xor %g1, 0, %g1 ! D.32824,, tmp126
subcc %g0, %g1, %g0 !, tmp126
addx %g0, 0, %g1 !,, D.32823
brz %g1, .LL3
nop ! D.32822,
sethi %hi(__tracepoint_scsi_eh_wakeup+32), %g1 !, tmp127
or %g1, %lo(__tracepoint_scsi_eh_wakeup+32), %g1 ! tmp127,, D.32820
ldx [%g1], %g1 !* D.32820, tmp128
stx %g1, [%fp+2015] ! tmp128, _________p1

Regards

Richard

--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
David Miller
2011-01-17 21:02:38 UTC
Permalink
From: Richard Mortimer <***@oldelvet.org.uk>
Date: Mon, 17 Jan 2011 19:46:21 +0000
Post by Richard Mortimer
As an example from drivers/scsi/scsi_error.c function scsi_eh_wakeup().
This has relocation records of
...
Post by Richard Mortimer
0000000000002be4 R_SPARC_LO10 __tracepoint_scsi_eh_wakeup
0000000000002be4 R_SPARC_13 *ABS*+0x0000000000000008
...
Post by Richard Mortimer
lduw [%g1+%lo(__tracepoint_scsi_eh_wakeup)+8], %g2 ! __tracepoint_scsi_eh_wakeup.state,
In a final object, the binutils linker should be using one
R_SPARC_OLO10 relocation for this kind of expression on sparc64. Not
the two relocations on the same instruction it appears to be using
here. I think you're looking at an object output by the assembler
and not the finally linked module.

You should also be careful about which objects you are analyzing. You
should be looking at the finally linked "foo.ko" file, not the
individual "foo.o" objects, as the majority of the relocations go away
when the linker puts together the final module.

Is that what you're doing?
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Richard Mortimer
2011-01-17 23:34:03 UTC
Permalink
Post by David Miller
Date: Mon, 17 Jan 2011 19:46:21 +0000
Post by Richard Mortimer
As an example from drivers/scsi/scsi_error.c function scsi_eh_wakeup().
This has relocation records of
...
Post by Richard Mortimer
0000000000002be4 R_SPARC_LO10 __tracepoint_scsi_eh_wakeup
0000000000002be4 R_SPARC_13 *ABS*+0x0000000000000008
...
Post by Richard Mortimer
lduw [%g1+%lo(__tracepoint_scsi_eh_wakeup)+8], %g2 ! __tracepoint_scsi_eh_wakeup.state,
In a final object, the binutils linker should be using one
R_SPARC_OLO10 relocation for this kind of expression on sparc64. Not
the two relocations on the same instruction it appears to be using
here. I think you're looking at an object output by the assembler
and not the finally linked module.
You should also be careful about which objects you are analyzing. You
should be looking at the finally linked "foo.ko" file, not the
individual "foo.o" objects, as the majority of the relocations go away
when the linker puts together the final module.
Is that what you're doing?
Yes in this instance I was/am. Thanks for the explanation.

However the same R_SPARC_13 also exists in scsi_mod.ko. It exists in the
original Debian 2.6.37-trunk-sparc64 version and in my current build of
the same with the 8 byte alignment for _trace_events.


00000000000074a8 R_SPARC_HI22 __tracepoint_scsi_eh_wakeup
00000000000074ac R_SPARC_LO10 __tracepoint_scsi_eh_wakeup
00000000000074ac R_SPARC_13 *ABS*+0x0000000000000008
00000000000074bc R_SPARC_LO10 __tracepoint_scsi_eh_wakeup
00000000000074bc R_SPARC_13 *ABS*+0x0000000000000020

74a8: 03 00 00 00 sethi %hi(0), %g1
74ac: c4 00 60 00 ld [ %g1 ], %g2
74b0: 80 a0 a0 00 cmp %g2, 0
74b4: 02 48 00 0c be %icc, 74e4 <scsi_eh_wakeup+0x50>
74b8: 01 00 00 00 nop
74bc: e0 58 60 00 ldx [ %g1 ], %l0

I guess that points towards the binutils linker not doing the correct
thing.

ld reports its version as

$ ld -v
GNU ld (GNU Binutils for Debian) 2.20.1-system.20100303

and scsi_mod.ko is linked with the following command

ld -r -m elf64_sparc -T /richmtmp/linux-2.6-2.6.37/scripts/module-common.lds --build-id -o drivers/scsi/scsi_mod.ko drivers/scsi/scsi_mod.o drivers/scsi/scsi_mod.mod.o

Regards

Richard

--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
David Miller
2011-01-18 00:18:57 UTC
Permalink
From: Richard Mortimer <***@oldelvet.org.uk>
Date: Mon, 17 Jan 2011 23:34:03 +0000
Post by Richard Mortimer
However the same R_SPARC_13 also exists in scsi_mod.ko. It exists in the
original Debian 2.6.37-trunk-sparc64 version and in my current build of
the same with the 8 byte alignment for _trace_events.
...

Thanks for the info Richard, I'll look more deeply into this.
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
David Miller
2011-01-18 00:37:09 UTC
Permalink
From: Richard Mortimer <***@oldelvet.org.uk>
Date: Mon, 17 Jan 2011 23:34:03 +0000
Post by Richard Mortimer
I guess that points towards the binutils linker not doing the correct
thing.
Ok, it is in fact doing the correct thing.

I'm really surprised we never hit this before in all of these years
:-) I guess we've simply never hit this kind of expression in a module
before.

The issue is that modules aren't a "final link", it's really more like
an intermediate partial link.

So we do end up seeing the R_SPARC_LO10 + R_SPARC_13 sequences in the
final module object.

Therefore, we really should handle R_SPARC_13 in the sparc module loader.

Richard, I want you to get full credit for this since you did all of
the dirty work :-) Would you please cons up a formal patch with commit
message and signoff for this and I'll push it around?

Thanks a lot!
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Richard Mortimer
2011-01-18 01:28:28 UTC
Permalink
Post by David Miller
Date: Mon, 17 Jan 2011 23:34:03 +0000
Post by Richard Mortimer
I guess that points towards the binutils linker not doing the correct
thing.
Ok, it is in fact doing the correct thing.
I'm really surprised we never hit this before in all of these years
:-) I guess we've simply never hit this kind of expression in a module
before.
The issue is that modules aren't a "final link", it's really more like
an intermediate partial link.
So we do end up seeing the R_SPARC_LO10 + R_SPARC_13 sequences in the
final module object.
Therefore, we really should handle R_SPARC_13 in the sparc module loader.
Richard, I want you to get full credit for this since you did all of
the dirty work :-) Would you please cons up a formal patch with commit
message and signoff for this and I'll push it around?
Thanks a lot!
Will do tomorrow. I'll dust off my git tree.

Regards

Richard
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
David Miller
2011-01-18 06:50:22 UTC
Permalink
From: David Miller <***@davemloft.net>
Date: Mon, 17 Jan 2011 16:37:09 -0800 (PST)
Post by David Miller
So we do end up seeing the R_SPARC_LO10 + R_SPARC_13 sequences in the
final module object.
Therefore, we really should handle R_SPARC_13 in the sparc module loader.
Ok, I now feel like I'm hallucinating.

***@sunset:~/src/GIT/linux-2.6-stable$ uname -a
Linux sunset 2.6.37 #1 SMP Wed Jan 12 20:14:59 PST 2011 sparc64 GNU/Linux
***@sunset:~/src/GIT/linux-2.6-stable$ objdump --reloc /lib/modules/2.6.37/kernel/net/ipv6/ipv6.ko | grep R_SPARC_13
0000000000000c7c R_SPARC_13 *ABS*+0x0000000000000004
0000000000001ae4 R_SPARC_13 *ABS*+0x0000000000000018
0000000000001b0c R_SPARC_13 *ABS*+0x0000000000000008
...
***@sunset:~/src/GIT/linux-2.6-stable$ lsmod | grep ipv6
ipv6 240422 12
***@sunset:~/src/GIT/linux-2.6-stable$

I must be missing something obvious.

--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Richard Mortimer
2011-01-18 10:52:07 UTC
Permalink
Post by David Miller
Date: Mon, 17 Jan 2011 16:37:09 -0800 (PST)
Post by David Miller
So we do end up seeing the R_SPARC_LO10 + R_SPARC_13 sequences in the
final module object.
Therefore, we really should handle R_SPARC_13 in the sparc module loader.
Ok, I now feel like I'm hallucinating.
Linux sunset 2.6.37 #1 SMP Wed Jan 12 20:14:59 PST 2011 sparc64 GNU/Linux
0000000000000c7c R_SPARC_13 *ABS*+0x0000000000000004
0000000000001ae4 R_SPARC_13 *ABS*+0x0000000000000018
0000000000001b0c R_SPARC_13 *ABS*+0x0000000000000008
...
ipv6 240422 12
I must be missing something obvious.
I think objdump may be distorting the truth a little. I found the
following in binutils gas/config/tc-sparc.c tc-gen_reloc(). I wonder if
it is displaying rewritten records rather than displaying the raw
contents. I haven't traced it through the code but the fact that it is
obviously working for you means that something like this is going on.

/* We expand R_SPARC_OLO10 to R_SPARC_LO10 and R_SPARC_13
on the same location. */
if (code == BFD_RELOC_SPARC_OLO10)
{
relocs[1] = reloc = (arelent *) xmalloc (sizeof (arelent));
relocs[2] = NULL;

reloc->sym_ptr_ptr = (asymbol **) xmalloc (sizeof (asymbol *));
*reloc->sym_ptr_ptr
= symbol_get_bfdsym (section_symbol (absolute_section));
reloc->address = fixp->fx_frag->fr_address + fixp->fx_where;
reloc->howto = bfd_reloc_type_lookup (stdoutput, BFD_RELOC_SPARC13);
reloc->addend = fixp->tc_fix_data;
}

I will try your alignment patch without any R_SPARC_13 related changes
and see how that goes.

Regards

Richard
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Richard Mortimer
2011-01-18 13:23:14 UTC
Permalink
Post by Richard Mortimer
Post by David Miller
Date: Mon, 17 Jan 2011 16:37:09 -0800 (PST)
Post by David Miller
So we do end up seeing the R_SPARC_LO10 + R_SPARC_13 sequences in the
final module object.
Therefore, we really should handle R_SPARC_13 in the sparc module loader.
Ok, I now feel like I'm hallucinating.
Linux sunset 2.6.37 #1 SMP Wed Jan 12 20:14:59 PST 2011 sparc64 GNU/Linux
0000000000000c7c R_SPARC_13 *ABS*+0x0000000000000004
0000000000001ae4 R_SPARC_13 *ABS*+0x0000000000000018
0000000000001b0c R_SPARC_13 *ABS*+0x0000000000000008
...
ipv6 240422 12
I must be missing something obvious.
I think objdump may be distorting the truth a little. I found the
following in binutils gas/config/tc-sparc.c tc-gen_reloc(). I wonder if
it is displaying rewritten records rather than displaying the raw
contents. I haven't traced it through the code but the fact that it is
obviously working for you means that something like this is going on.
/* We expand R_SPARC_OLO10 to R_SPARC_LO10 and R_SPARC_13
on the same location. */
if (code == BFD_RELOC_SPARC_OLO10)
{
relocs[1] = reloc = (arelent *) xmalloc (sizeof (arelent));
relocs[2] = NULL;
reloc->sym_ptr_ptr = (asymbol **) xmalloc (sizeof (asymbol *));
*reloc->sym_ptr_ptr
= symbol_get_bfdsym (section_symbol (absolute_section));
reloc->address = fixp->fx_frag->fr_address + fixp->fx_where;
reloc->howto = bfd_reloc_type_lookup (stdoutput, BFD_RELOC_SPARC13);
reloc->addend = fixp->tc_fix_data;
}
Dave,

To close this off as a non-issue as far as my boot failures are
concerned I did some further checking and objdump is displaying
R_SPARC_OLO10 as two separate entries. I checked the scsi_mod.ko binary
and found the appropriate Elf64_Rela entry. That has 0x21 as the LSB of
r_info and that is the proper code for R_SPARC_OLO10 which is what you
expected in the first place!

objdump displays

0000000000001618 R_SPARC_LO10 __tracepoint_module_get
0000000000001618 R_SPARC_13 *ABS*+0x0000000000000008

The binary contains

0505140 00 00 00 00 00 00 16 18 00 00 04 0b 00 00 08 21
^^
0505160 00 00 00 00 00 00 00 00

which corresponds to

typedef struct elf64_rela {
Elf64_Addr r_offset; /* Location at which to apply the action */
Elf64_Xword r_info; /* index and type of relocation */
Elf64_Sxword r_addend; /* Constant addend used to compute value */
} Elf64_Rela;

Regards

Richard
Post by Richard Mortimer
I will try your alignment patch without any R_SPARC_13 related changes
and see how that goes.
Regards
Richard
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
More majordomo info at http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
David Miller
2011-01-18 21:00:27 UTC
Permalink
From: Richard Mortimer <***@oldelvet.org.uk>
Date: Tue, 18 Jan 2011 13:23:14 +0000
Post by Richard Mortimer
To close this off as a non-issue as far as my boot failures are
concerned I did some further checking and objdump is displaying
R_SPARC_OLO10 as two separate entries. I checked the scsi_mod.ko binary
and found the appropriate Elf64_Rela entry. That has 0x21 as the LSB of
r_info and that is the proper code for R_SPARC_OLO10 which is what you
expected in the first place!
Thanks for figuring this out Richard.

I'll look into fixing binutils so that it properly reports the
correct R_SPARC_OLO10 relocation in dumps. There really is no
excuse for what it's currently doing. In fact, I think this
quirk has sent me on wild goose chases in the past.

--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
David Miller
2011-01-19 04:12:27 UTC
Permalink
From: David Miller <***@davemloft.net>
Date: Tue, 18 Jan 2011 13:00:27 -0800 (PST)
Post by David Miller
I'll look into fixing binutils so that it properly reports the
correct R_SPARC_OLO10 relocation in dumps. There really is no
excuse for what it's currently doing. In fact, I think this
quirk has sent me on wild goose chases in the past.
Ok, after looking into it I got reminded why this is still behaving
the way that it is.

The problem is that R_SPARC_OLO10 relocations have 2 addends instead
of one. But the BFD libraries generic "arelent" structure only has
room to store one of the addends.

So the 64-bit ELF Sparc backend emits R_SPARC_OLO10 as R_SPARC_LO10 +
R_SPARC_13 internally solely for the purpose of being able to store
the second addend away somewhere.

This all happens in the relocation table slurping code in
bfd/elf64-sparc.c, and this is what objdump uses to get it's info.
BTW, readelf prints the relocations of this type properly and
without this weird translation.

I considered several avenues to make this translation scheme
unnecessary.

Making arelent larger in order to store the second addend would be
frowned upon since only 64-bit sparc needs it but the space cost would
be eaten by everyone.

The other idea was to somehow make the existing addend field smaller
on 64-bit so we could add the second addend storage to arelent at
zero cost. But after studying the binutils code I'm pretty sure
that the addend field really does need to be a full 64-bits.

This really needs to be fixed for another reason, which is that this
scheme requires allocating twice as much memory to store relocations
internally. This is because we have to size the internal relocation
table in BFD before we scan the table and know if any R_SPARC_OLO10
compound relocs actually exist and if so how many.

An even worse case is MIPS 64-bit, where every relocation expands to
3 (?!?!) internal BFD relocation objects.

Anyways just a detailed brain dump on what the situation is with this,
I'll see if I can come up with other ideas.

--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Bernhard R. Link
2011-01-17 14:39:54 UTC
Permalink
Post by David Miller
Although I've seen commentary to the contrary, in fact using a too-small
__attribute__((aligned())) directive will lower the alignment of data
members, and yes that means it will lower the alignemnt to be below the
natural and required alignment for the given type.
struct foo {
void *bar;
};
static struct foo test __attribute__((__aligned__(4)));
The compiler will emit "test" with 4-byte alignment into the data
section, even though 8-byte alignment is required for "test.bar"
Assuming we wanted that to actually happen, the GCC manual is very
explicit to state that in order for this to work, such down-aligned
data structures must also use the "packed" attribute.
The manual seems to have changed in that regard.

http://gcc.gnu.org/onlinedocs/gcc-4.2.4/gcc/Variable-Attributes.html
and earlier versions say:

"The aligned attribute can only increase the alignment; but you can
decrease it by specifying packed as well. See below."

but http://gcc.gnu.org/onlinedocs/gcc-4.3.5/gcc/Variable-Attributes.html
and later versions say:

"When used on a struct, or struct member, the aligned attribute can only
increase the alignment; in order to decrease it, the packed attribute
must be specified as well. When used as part of a typedef, the aligned
attribute can both increase and decrease alignment, and specifying the
packed attribute will generate a warning."

And seems to still be a bit confusing, as an attribute in a
variable declaration seems to count as typedef:

| struct foo {
| void *bar;
| };
|
| static struct foo a __attribute__((__aligned__(2)));
| static struct foo b;
|
| struct foo2 {
| void *bar;
| } __attribute__((__aligned__(2)));
|
| static struct foo2 c __attribute__((__aligned__(2)));
| static struct foo2 d;
|
| struct foo3 {
| void *bar;
| } __attribute__((__aligned__(2))) __attribute__((__packed__));
|
| static struct foo3 e;
|
| int main() {
| printf("a: %d, b: %d, c: %d, d: %d, e: %d\n",
| __alignof__(a), __alignof__(b),
| __alignof__(c), __alignof__(d),
| __alignof__(e)
| );
| return 0;
| }

gives something like:
a: 2, b: 4, c: 2, d: 4, e: 2
or on sparc64:
a: 2, b: 8, c: 2, d: 8, e: 2
Post by David Miller
I think we want none of this, and I think we should elide the align
directives entirely, or at least fix them so we don't get unaligned
stuff on 64-bit.
One fix might be to move the __attribute__ from include/trace/ftrace.h
(and from include/linux/syscalls.h) to include/linux/ftrace_event.h
and attach it to the struct there. This way it should only increase it.
Post by David Miller
Ugh, and I just noticed that include/linux/klist.h does this fixed
alignment of "4" too, where is this stuff coming from? It's
wrong on 64-bit, at best. But I can't see the impetus behind doing
this at all in the first place.
Is that actually misaligned? Unless I still mix things up, that is in the
struct thus should be fine (i.e. the "d" case in my example above).

Bernhard R. Link
David Miller
2011-01-18 05:24:44 UTC
Permalink
From: "Bernhard R. Link" <brl+***@pcpool00.mathematik.uni-freiburg.de>
Date: Mon, 17 Jan 2011 15:39:54 +0100
Post by Bernhard R. Link
Post by David Miller
Ugh, and I just noticed that include/linux/klist.h does this fixed
alignment of "4" too, where is this stuff coming from? It's
wrong on 64-bit, at best. But I can't see the impetus behind doing
this at all in the first place.
Is that actually misaligned? Unless I still mix things up, that is in the
struct thus should be fine (i.e. the "d" case in my example above).
On CRIS, structs naturally align on a byte-boundary only.

However, code using klists encodes a binary state in the lowest bit of
klist pointers. So this assumes that the structures will be at least
2 byte aligned, which will not be true on CRIS.

We have a lot of other code which uses this trick (encoding 1 or 2
bits of binary state into the lowest bits of a pointer) so I'm
surprised this workaround isn't needed elsewhere for CRIS too.
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Jesper Nilsson
2011-01-18 09:26:39 UTC
Permalink
Post by David Miller
Date: Mon, 17 Jan 2011 15:39:54 +0100
Post by Bernhard R. Link
Post by David Miller
Ugh, and I just noticed that include/linux/klist.h does this fixed
alignment of "4" too, where is this stuff coming from? It's
wrong on 64-bit, at best. But I can't see the impetus behind doing
this at all in the first place.
Is that actually misaligned? Unless I still mix things up, that is in the
struct thus should be fine (i.e. the "d" case in my example above).
On CRIS, structs naturally align on a byte-boundary only.
However, code using klists encodes a binary state in the lowest bit of
klist pointers. So this assumes that the structures will be at least
2 byte aligned, which will not be true on CRIS.
We have a lot of other code which uses this trick (encoding 1 or 2
bits of binary state into the lowest bits of a pointer) so I'm
surprised this workaround isn't needed elsewhere for CRIS too.
Surprisingly, we haven't found any other place where this is an issue
for CRIS, either the code isn't used on CRIS, or the aligment is
ensured by some other means (kmalloc for example).

/^JN - Jesper Nilsson
--
Jesper Nilsson -- ***@axis.com
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
David Miller
2011-01-18 06:27:24 UTC
Permalink
From: "Bernhard R. Link" <brl+***@pcpool00.mathematik.uni-freiburg.de>
Date: Mon, 17 Jan 2011 15:39:54 +0100
Post by Bernhard R. Link
Post by David Miller
I think we want none of this, and I think we should elide the align
directives entirely, or at least fix them so we don't get unaligned
stuff on 64-bit.
One fix might be to move the __attribute__ from include/trace/ftrace.h
(and from include/linux/syscalls.h) to include/linux/ftrace_event.h
and attach it to the struct there. This way it should only increase it.
I'm beginning to think that the align directive is there purposely to
down-align the structure so that the amount of space that tracing
information consumes is minimized.

I honestly can't tell, only Steven Rostedt can tell us for sure,
because there are no commit messages or comments that explain why
these things need to be there.

If the align directives exist for that reason, then your suggestion
would likely completely undo what these directives are trying to
achieve.

Someone mentioned that the default struct alignment on x86-64 is
something rather rediculious like 32 bytes or something like that.
Yet someone else suggested to use __aligned__(32) to fix this, so
color me completely confused.

--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Steven Rostedt
2011-01-18 17:05:44 UTC
Permalink
Post by David Miller
I'm beginning to think that the align directive is there purposely to
down-align the structure so that the amount of space that tracing
information consumes is minimized.
I honestly can't tell, only Steven Rostedt can tell us for sure,
because there are no commit messages or comments that explain why
these things need to be there.
You may have missed my earlier repsonse.

The alignment was there to keep the linker from adding holes into the
sections that store this data. As the tracepoints are processed at boot
up like an array (as the initcalls are done). If the linker adds space
into the section as it puts the sections from all the object files
together, it will crash the kernel as we read that section as an array.

The min alignment I used solved this issue (which I hit on x86_64). But
I could also have made the structures aligned to a bigger alignment,
which should also work. But this will add holes between each item in the
array.
Post by David Miller
If the align directives exist for that reason, then your suggestion
would likely completely undo what these directives are trying to
achieve.
Someone mentioned that the default struct alignment on x86-64 is
something rather rediculious like 32 bytes or something like that.
Yet someone else suggested to use __aligned__(32) to fix this, so
color me completely confused.
If you have another idea to keep the linker from breaking the section up
where it can't be read as an array, I'm all ears :-)

-- Steve


--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Continue reading on narkive:
Loading...