How to get the last not-null value in an ordered column of a huge table?

I have the following input:



 id | value
----+-------
  1 |   136
  2 |  NULL
  3 |   650
  4 |  NULL
  5 |  NULL
  6 |  NULL
  7 |   954
  8 |  NULL
  9 |   104
 10 |  NULL


I expect the following result:



 id | value
----+-------
  1 |   136
  2 |   136
  3 |   650
  4 |   650
  5 |   650
  6 |   650
  7 |   954
  8 |   954
  9 |   104
 10 |   104


The trivial solution would be to join the table with itself on an inequality (t2.id >= t1.id) and then take the MAX id in a GROUP BY:



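WITH tmp AS (
    -- For every row t2, find the id of the latest row t1 at or before it
    -- that has a non-NULL value.
    SELECT t2.id, MAX(t1.id) AS lastKnownId
    FROM t t1, t t2
    WHERE t1.value IS NOT NULL
      AND t2.id >= t1.id
    GROUP BY t2.id
)
-- Look up the value stored at that id.
SELECT tmp.id, t.value
FROM t, tmp
WHERE t.id = tmp.lastKnownId;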


However, a naive execution of this query internally produces a number of rows equal to the square of the row count of the input table (O(n^2)). I expected T-SQL to optimize that away: at the block/record level, the task is very easy and linear, essentially a for loop (O(n)).
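A minimal sketch of the linear pass meant here, written in plain Python rather than T-SQL and assuming the rows arrive already ordered by id. The names `fill_forward` and `sample` are hypothetical and used only for illustration:

```python
from typing import Iterable, Iterator, Optional, Tuple

Row = Tuple[int, Optional[int]]


def fill_forward(rows: Iterable[Row]) -> Iterator[Row]:
    """Yield (id, value) with each NULL replaced by the last non-NULL value seen."""
    last_known: Optional[int] = None  # stays None until the first non-NULL value
    for row_id, value in rows:
        if value is not None:
            last_known = value
        yield row_id, last_known


# The sample input from the question; None stands in for NULL.
sample = [(1, 136), (2, None), (3, 650), (4, None), (5, None),
          (6, None), (7, 954), (8, None), (9, 104), (10, None)]

filled = list(fill_forward(sample))
print(filled)
```

Each row is visited once and only one value is carried along, which is the O(n) behaviour hoped for from the query.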



However, in my experiments, the latest MS SQL 2016 can't optimize this query correctly, which makes it impossible to run on a large input table.

Furthermore, the query has to run quickly, which makes a similarly easy (but very different) cursor-based solution infeasible.

Using a memory-backed temporary table could be a good compromise, but I am not sure it would run significantly faster, considering that my example query using subqueries didn't work.

I am also thinking of digging some window function out of the T-SQL docs that could be tricked into doing what I want. For example, a cumulative sum does something very similar, but I couldn't trick it into giving the latest non-null element instead of the sum of the preceding elements.

The ideal solution would be a quick query without procedural code or temporary tables. Alternatively, a solution with temporary tables is also okay, but iterating over the table procedurally is not.










  • I think this might be possible with lag (certainly if there was only one row between values), but can’t quite put my finger on it right now and don’t have the environment to play with it.

    – jmoreno
    1 hour ago















t-sql






asked 4 hours ago by peterh, edited 4 hours ago by peterh

1 Answer
One method, using MAX() and COUNT() with OVER() (based on this source), could be:



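SELECT ID, MAX(value) OVER (PARTITION BY Value2) AS value
FROM
(
    -- COUNT(value) skips NULLs, so the running count only increases on rows
    -- with a non-NULL value; each such row starts a new group.
    SELECT ID, value,
           COUNT(value) OVER (ORDER BY ID) AS Value2
    FROM dbo.HugeTable
) a
ORDER BY ID;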


Result



Id UpdatedValue
1 136
2 136
3 650
4 650
5 650
6 650
7 954
8 954
9 104
10 104
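To see why the grouping trick works, the running count can be traced by hand. The following is only a rough sketch in plain Python, not T-SQL. It assumes the ids are unique and the values are listed in id order; `group_ids` and `filled` are hypothetical names:

```python
from itertools import accumulate, groupby

# Values from the question in id order; None stands in for NULL.
values = [136, None, 650, None, None, None, 954, None, 104, None]

# Running count of non-NULL values, mirroring COUNT(value) OVER (ORDER BY ID).
group_ids = list(accumulate(0 if v is None else 1 for v in values))

# Each group holds one non-NULL value followed by its trailing NULLs,
# so the maximum over the group is exactly that one value.
filled = []
for _, members in groupby(zip(group_ids, values), key=lambda pair: pair[0]):
    group_values = [v for _, v in members]
    non_null = [v for v in group_values if v is not None]
    group_max = max(non_null) if non_null else None
    filled.extend([group_max] * len(group_values))

print(group_ids)
print(filled)
```

Rows that come before the first non-NULL value would fall into group 0 and stay NULL, since there is nothing earlier to carry forward.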



Another method, based on this source and closely related to the first example:



;WITH CTE AS
(
    SELECT value,
           Id,
           COUNT(value) OVER (ORDER BY Id) AS Value2
    FROM dbo.HugeTable
),
CTE2 AS
(
    SELECT Id,
           value,
           FIRST_VALUE(value) OVER (PARTITION BY Value2 ORDER BY Id) AS UpdatedValue
    FROM CTE
)
SELECT Id, UpdatedValue
FROM CTE2;
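Both queries can be tried on the sample data without a SQL Server instance. The sketch below runs them through Python's built-in sqlite3 module, and it rests on several assumptions: SQLite stands in for SQL Server, the bundled SQLite is version 3.25 or newer (needed for window functions), the `dbo.` schema prefix is dropped, and an ORDER BY is added to the second query to make the output order deterministic. It checks only that the two queries agree on the result; it says nothing about how SQL Server would plan or time them on a huge table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE HugeTable (Id INTEGER PRIMARY KEY, value INTEGER)")
conn.executemany(
    "INSERT INTO HugeTable (Id, value) VALUES (?, ?)",
    [(1, 136), (2, None), (3, 650), (4, None), (5, None),
     (6, None), (7, 954), (8, None), (9, 104), (10, None)],
)

# First method: MAX over groups defined by the running count of non-NULL values.
max_query = """
SELECT Id, MAX(value) OVER (PARTITION BY Value2) AS UpdatedValue
FROM (
    SELECT Id, value, COUNT(value) OVER (ORDER BY Id) AS Value2
    FROM HugeTable
) AS a
ORDER BY Id
"""

# Second method: FIRST_VALUE within the same groups, ordered by Id.
first_value_query = """
WITH CTE AS (
    SELECT Id, value, COUNT(value) OVER (ORDER BY Id) AS Value2
    FROM HugeTable
)
SELECT Id,
       FIRST_VALUE(value) OVER (PARTITION BY Value2 ORDER BY Id) AS UpdatedValue
FROM CTE
ORDER BY Id
"""

max_rows = conn.execute(max_query).fetchall()
first_value_rows = conn.execute(first_value_query).fetchall()
print(max_rows)
print(max_rows == first_value_rows)
conn.close()
```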





answered 4 hours ago by Randi Vertongen, edited 3 hours ago