Hi,
i need to "expand" a raw datatable like
1, 'Harry', 'London'
2, 'Peter', null
3, null, 'New York'
into this format
1, 'Harry', 'London'
2, 'Peter', 'London'
3, 'Peter', 'New York'
so, for several columns, null values should be replaced with the values of the previous (non-null) row. the raw data is bulk-inserted and cannot be modified before.
is there anything other (faster!) than using a cursor ?
i like to have on single update command to do the job.does each row in the table have a unique sequential key?|||Originally posted by Paul Young
does each row in the table have a unique sequential key?
yes, the first column is an identity|||why not join the table to its self based on the identity column?
select *
from yourbcptable ybt1
join yourbcptable ybt2 on ybt1.IdentityColumn - 1 = ybt2.IdentityColumn
where ybt1.IdentityColumn > 1
This assumes your identity seed = 1|||yes, the identity seed is 1, so your select works,
but what about multiple null rows in order like
1, 'Peter', 'london'
2, 'Markus', null
3, 'Mary', null
for row 3 the row (id-1) will be null as well?|||either a row at a time approach or run the update till you no longer have nulls or wrap everything up in a stored procedure.
probably should have added more to the where caluse
where ybt1.IdentityColumn > 1
and ytb1.col1 is null or ybt1.col2 is null
BTW, I would NOT use a CURSOR as these tend to be resource intensive.|||i got closer. the following update will nearly do the job, but the update order needs to be by the identity column. is there a way to set the order for updates ?
update mytable set
@.Column1 = Column1 = isnull( Column1, @.Column1 ),
@.Column2 = Column2 = isnull( Column2, @.Column2 )
this will set column1 to column1 (keep the value) when it is not null or will set column1=@.column1, which holds then "not null"-value of the rows before. works fine - but only when processed in correct order !|||No guaranties on the order. SQL Server will process in the most efficient manner.
You might try setting a clustered index on the key, but I would check the results very closely.|||--create table T (a int identity, N varchar(10), C varchar(10))
/*
delete from T
insert into T (n,c) select 'Harry','London'
insert into T (n,c) select 'Peter',null
insert into T (n,c) select null,'New York'
*/
select * from T
update nxt set N=isnull(nxt.N,prv.N),C=isnull(nxt.C,prv.C)
from T nxt join T prv on nxt.a=prv.a+1
select * from T|||Originally posted by kukuk
--create table T (a int identity, N varchar(10), C varchar(10))
/*
delete from T
insert into T (n,c) select 'Harry','London'
insert into T (n,c) select 'Peter',null
insert into T (n,c) select null,'New York'
*/
select * from T
update nxt set N=isnull(nxt.N,prv.N),C=isnull(nxt.C,prv.C)
from T nxt join T prv on nxt.a=prv.a+1
select * from T
correct - that's what we discussed before,
but what about multiple null rows in order like
1, 'Peter', 'london'
2, 'Markus', null
3, 'Mary', null
for row 3 the row (id-1) will be null as well?
there might be many null-rows after one data-row.|||:)
while exists (select * from T where a>1 and (C is null or N is null) )
update nxt set N=isnull(nxt.N,prv.N),C=isnull(nxt.C,prv.C)
from T nxt join T prv on nxt.a=prv.a+1
(will work only if the first row does not contain nulls)
or
update x set
N=isnull(N,(select top 1 N from T as y where x.a>y.a and y.N is not null order by a desc)),
C=isnull(C,(select top 1 C from T as z where x.a>z.a and z.C is not null order by a desc))
from T as x
(expensive one)|||Originally posted by kukuk
:)
while exists (select * from T where a>1 and (C is null or N is null) )
update nxt set N=isnull(nxt.N,prv.N),C=isnull(nxt.C,prv.C)
from T nxt join T prv on nxt.a=prv.a+1
(will work only if the first row does not contain nulls)
or
update x set
N=isnull(N,(select top 1 N from T as y where x.a>y.a and y.N is not null order by a desc)),
C=isnull(C,(select top 1 C from T as z where x.a>z.a and z.C is not null order by a desc))
from T as x
(expensive one)
the first one seems to be the one ...
i'll give it a try and come back later|||hi kukuk,
i think i got it:
declare @.A int, @.B varchar(20), @.C varchar(20)
select top 1 @.A = A from Tbl order by ID
while @.@.RowCount>0 begin
update Tbl
set @.B= B= isnull( B, @.B ),
@.C= C= isnull( C, @.C ),
@.A=A+1
from Tbl
where A=@.A
end
its a mix of your first select and my variable-version. your select works fine too, but it is much more expensive than the one above.
thanks to all for your time
markus|||Yes but the number of iterations will be equal to nuimber of rows in the table.|||Originally posted by kukuk
Yes but the number of iterations will be equal to nuimber of rows in the table.
true, but i don't care as long as it runs much faster.
Subscribe to:
Post Comments (Atom)

No comments:
Post a Comment