
|
Cleaning up legacy data in SQL Server
If you want to import old data into a new database in order to clean it up, you can use stored procedures and cursors. For example, you might need to do this if you've got a database with phone numbers entered in a variety of formats and want to convert them all to a single, consistent format. To illustrate how to transfer and clean up data, we're going to use the pubs sample database (included with SQL Server). As an example, we're going to show you how you could strip out the dashes in the au_id column of the authors table. As part of this process, we'll need to copy the au_id, au_lname, and au_fname columns to another table; in our example, we used a table name of authorIDs. Here are the steps:
1. Create the authorIDs table: CREATE TABLE dbo.authorIDs ( au_id id NOT NULL , au_lname varchar (40) NOT NULL , au_fname varchar (20) NOT NULL , ) GO 2. Next, copy the data from the authors table to authorIDs: insert authorids(au_id, au_fname, au_lname) select au_id, au_fname, au_lname from authors
3. Create the following stored procedure to strip out the dashes in the au_id column (but don't run it yet): CREATE PROCEDURE scrub_au_ID @au_id char(11), @au_id_new varchar(11) output AS set nocount on declare @counter int select @au_id_new="" select @counter =1 while @counter <11 begin if ascii(substring(@au_id, @counter,1)) >47 and ascii(substring(@au_id, @counter,1))<58 select @au_id_new = @au_id_new + substring(@au_id,@counter,1) select @counter = @counter +1 end GO
4. Now create the following stored procedure (again, don't run it yet). This procedure creates a cursor of the author IDs, calls the scrub procedure, and then updates the authorIDs table based on the cursor position: CREATE PROCEDURE auth_ids AS DECLARE @au_id varchar(11) DECLARE @au_id_new varchar(11) DECLARE authids_cursor CURSOR FOR SELECT au_id FROM authorids OPEN authids_cursor FETCH NEXT FROM authids_cursor INTO @au_id WHILE (@@FETCH_STATUS <> -1) BEGIN IF (@@FETCH_STATUS <> -2) BEGIN execute scrub_au_id @au_id, @au_id_new output /*update authorids set au_id=@au_id_new where au_id=@au_id*/ update authorids set au_id=@au_id_new WHERE CURRENT OF authids_cursor END FETCH NEXT FROM authids_cursor INTO @au_id END DEALLOCATE authids_cursor GO
5. You're now ready to clean up your data. Run the following commands: execute auth_ids select * from authorids
|
||||
| Search RD Techbase | ||||
![]()