Fun With Linq!


NOTE: this article was written against a beta version of the product or technology discussed in this article. There is absolutely no guarantee that the solution or code demonstrated here will work on the final version of the product.

Quite often I have discussed the impossibility of joining data from a database with that of the file system. I discussed it with my team members, mentioned it in trainings and wrote about it recently in this article.

Being able to join data from a database with other data from a file system could be very useful though.

Introduction

Imagine a web based content management system that allows a user to enter articles. Together with the article, she can upload one more images that are presented with the image. So far so good. However, when the article is deleted from the database, the images should also have been deleted. Unfortunately, quite often a developer forgets to add this functionality to the system.

I have seen it many times before in old web applications that we're maintaining. We have huge folders with images, probably half of which are orphaned. I often joked that it would be nice to do something like this:

DELETE FROM C:\Uploads\Images WHERE File.Name NOT IN (SELECT FileName FROM Attachments) 

where C:\Uploads\Images is the folder containing the uploaded images and Attachments is a database table with a column called FileName that holds the filename of an image, like NewsArticle123.jpg.

Obviously, this doesn't work as the file system doesn't understand database parlance, while at the same time the database doesn't understand what a file is, or how the file system works.

For some time, I was hoping that WinFS would enable this kind of querying, but I guess we'll have to wait a bit more to find out.

But as soon as I saw Linq and some of its database and collection capabilities, I figured it should be possible to get real close to my final syntax. With the help of a single Linq to SQL file, some drag and drop operations and a number of lines of code, I was able to accomplish what I wanted: an easy way to delete orphaned images. Here's what I did:

  1. Added a Linq to SQL file to my .NET 3.5 Windows Application.

  2. From the Server Explorer, dragged the Attachments table onto the Object Relational Designer.

  3. Called the following code from a Button's Click event:
public void DeleteOrphans()
{
  string imagePath = @"C:\Uploads\Images";
  FileInfo[] myFiles = new DirectoryInfo(imagePath).GetFiles();

  MyAttachmentsDataContext db = new MyAttachmentsDataContext();

  var orphanedFiles =
      from o in myFiles
      join a in db.Attachments on o.Name equals a.FileName into theFiles
      where !db.Attachments.Any(p => p.FileName == o.Name)
      select o;

  foreach (var orphanedFile in orphanedFiles)
  {
    orphanedFile.Delete();
  }
}

The first two lines of the code get an array of FileInfo objects with all the files in the C:\Uploads\Images folder. Next, these files are joined with the result of the Attachments table with a where clause that filters out all files that have a matching Attachment record in the database. Once the query is fired in the foreach block, the resulting orphanedFiles is a collection of FileInfo instances that don't have an associated record in the Attachments table and can thus be deleted.

Notice how the type inference system did a good job in determining the type of the items in the collection. Although I didn't explicitly determine a type for the orphanedFiles collection, the compiler knows that it will contain instances of FileInfo, so I can call the Delete method without an additional cast.

This is my first stab at solving this problem with Linq, so there may be better or easier ways to do it and get even closer to my initial code example. I also think it won't scale very well to folders with thousands of files. However, it's a nice example and was fun to figure out. It also shows the great potential of the upcoming Linq technologies.


Where to Next?

Wonder where to go next? You can post a comment on this article.

Doc ID 424
Full URL https://imar.spaanjaars.com/424/fun-with-linq
Short cut https://imar.spaanjaars.com/424/
Written by Imar Spaanjaars
Date Posted 03/16/2007 17:30

Comments

Talk Back! Comment on Imar.Spaanjaars.Com

I am interested in what you have to say about this article. Feel free to post any comments, remarks or questions you may have about this article. The Talk Back feature is not meant for technical questions that are not directly related to this article. So, a post like "Hey, can you tell me how I can upload files to a MySQL database in PHP?" is likely to be removed. Also spam and unrealistic job offers will be deleted immediately.

When you post a comment, you have to provide your name and the comment. Your e-mail address is optional and you only need to provide it if you want me to contact you. It will not be displayed along with your comment. I got sick and tired of the comment spam I was receiving, so I have protected this page with a simple calculation exercise. This means that if you want to leave a comment, you'll need to complete the calculation before you hit the Post Comment button.

If you want to object to a comment made by another visitor, be sure to contact me and I'll look into it ASAP. Don't forget to mention the page link, or the Doc ID of the document.

(Plain text only; no HTML or code that looks like HTML or XML. In other words, don't use < and >. Also no links allowed.