Posts

Showing posts from November, 2021

SQL Server, how to improve privacy (and be more gdpr complaiant) at no cost with the "dynamic data masking" feature

Image
Hi guys, Welcome back mates!  Today we will talk about privacy because, at least here in Europe, GDPR laws are not a joke. For this reason I decided to tell you how to be more compliant using a SQL Server function, all at no cost . We will use the dynamic data masking (DDM) feature introduced with the SQL server 2016 edition. Let's see today how to use it in practice! The dynamic data masking feature The dinamic data masked is a feature that allows you to mask data that we consider sensitive . By performing a select on the database, the columns we have chosen, will be shown in a masked way . Personally, I strongly recommend it to all companies that develop softwar e. Let's see how it works! Well, this time for the demo we will use a localized demo databases available for SAP Business One, its name is SBODemoUS. From this database we read the AADM table by extracting a few fields, a couple of which we think are sensitive . The T-SQL command we run is this: SELECT DISTINCT [

SQL Server, SQL Server, fast data reading with table partitioning! ...Vamonos!

Image
Hi Guys, Welcome back for this midweek post! Performance has always been our favorite topic and then even today we see how to read the data faster using the table partitioning! Are you ready? ...as my Mexican friends would say ... vamonos!   What is the table partitioning? Table partitioning is a data organization scheme where the table data is divided into multiple storage objects called partitions. When a table becomes large it might be useful to partition the data and this is done following a rule. For example, if we had a huge list of people we could think of creating a partition for each initial letter of the name . The first partition would contain all the people whose name starts with the letter A, the second partition would contain all the people who start with the letter B, and so on. The nice and interesting aspect is that when we search the table for a person starting with the letter A we will only read the data from the first partition. All other partitions would not be a

SQL Server, other ideas for a fast insert ...multivalues!

Image
Hi Guys, Welcome back!   After yesterday's post , here is another one today, always light! Another trick for faster inserts! Enjoy the reading! Multivalues The T-SQL syntax allows the specification of more than one set of values ​​in the INSERT statement  For example: INSERT INTO Table (code,descr) VALUES ('0001','first item'), ('0002','second item'), ('0003','third item') We can specify a maximum of 1000 set of values. Let's see what we can do with it ... In some cases we could use this possibility to gain speed. The test Come on! follow me!   Let's create a simple heap table and insert some data, let's say 100,000 rows CREATE TABLE MOV (id int, Qty float) Now i use this loop to fill the table: DECLARE @i INTEGER ; SET @i = 0; WHILE @i < 100000 -- 100K BEGIN EXEC (' INSERT INTO MOV (id) VALUES (1) ') SET @i = @i + 1 END No

SQL Server, How to do a fast massive insert

Image
Hi Guys! Today a light post to read to start the week!   An easy and practical trick related to massive insertions. Do we insert indexes before or after populating a table? Enjoy the reading!     A massive insert Suppose you have to insert a large number of rows into a table that doesn't exist yet. This table will have a clustered index let's say on the ID field. Let's start with a question: When do we create our clustered index? I have seen many times procedures that created the table and put the clustered index on it, then mass insertion took place. Let's do our test! Let's create our table with the command: CREATE TABLE [dbo].[Movements]( [Id] [int] [Qty] [float] NULL, [Price] [float] NULL ) ON [PRIMARY] Right now our table has no indexes so it's called a heap table . Let's create our clustered index: CREATE CLUSTERED INDEX CI_MOVEMENTS_ID ON Movements(ID) Now we insert 10 million rows with this command: INSERT INTO Movement

SQL Server: Grouped Aggregate Pushdown. Make columnstore indexes go faster with aggregate pushdown functionality!

Image
Hi Guys,   Today as promised we will talk about an important feauture on the columnstore indexes called Grouped Aggregate Pushdown . We already talked about this feature two post ago here: SQL Server, the "Group By" competition: from 30 seconds to 56 milliseconds and here: SQL Server, the "Distinct clause" competition. Oops i did it again! ...from 24 seconds to 6 milliseconds!   We have seen that a columnstore index can return directly already filtered data so this feature is very powerful ! I want to repeat the official definition because in my opinion it is very beautiful: Some aggregate computations can be pushed to the scan node instead of calculated in an aggregate node. Where is the problem? Activating this function is not really trivial! Let's see, even with the help of a few examples, how to do it. Enjoy the reading mate!   "Summary of previous episodes".. let's recap! In the example we had done in the previous post we had run the followi

Ladies and gentlemen ...drumroll... SQL Server 2022 is among us! New features and improvements!

Image
Hi Guys, Breaking news! We are in the midst of the top Microsoft event called Microsoft Ignite and many of us were waiting for some data on the successor of SQL Server 2019.  Since yesterday the new version of SQL Server has a name: SQL Server 2022 yes, the new SQL Server 2022 was been unveiled yesterday morning during Executive Vice President Scott Guthrie's keynote. The product is only in private preview and so few details are available. As our blog talks about performance what we're interested in are improvements to the key performance features of the database engine . We will try to understand from a performance point of view what new features await us. We are waiting to get our hands on the first available version ( and this should happen soon ) We can't wait to start testing .. so follow me! New features and improvements in SQL Server 2022 For some years now, Microsoft has directed the development of the optimizer in the direction of increasing its ability to  modi

SQL Server, the "Distinct clause" competition. Oops i did it again! ...from 24 seconds to 6 milliseconds!

Image
Hi guys, Thank you all mates, the last post was very successful and therefore I propose its continuation. I recommend you read it before reading this post. We were left to the point of using the feature called grouped aggregate pushdown applied to a columnstore index. The shortest run time obtained was 56 milliseconds (I remember that we started from 30 seconds ..omg)   Introduction In the last post we "played" with the "Group By". With a "Group By" clause we can use aggregate functions like COUNT, MIN,MAX,etc and perform calculation on a group of rows to return unique value: SELECT Classifiers, COUNT (Classifiers) FROM Movements GROUP BY Classifiers But if you need only to remove duplicate values and so extract distinct values we can use a DISTINCT operator we can simply write this T-SQL command: SELECT DISTINCT Classifiers FROM Movements Today we apply the same concepts we saw in the last post to a query that contains a distinct