Talk:Linked list
|
|
| Missing image Cscr-former.png Former featured article candidate | This article is a former featured article candidate. Please view its sub-page to see why the nomination failed. For older candidates (where the individual nomination does not exist) please check the archive. Once the objections have been met you may resubmit the article for featured article status. |
how can you distinguish the pointer "next" and "prev" ?
Good job, Dcoetzee! Making trashy looking pics always make people do nice ones. :) You might also wanna check In-order traversal if you have time. BL 06:52, Jan 23, 2004 (UTC)
Hmmm, the last code example is much too obscure, and also uses a singly-linked list, as opposed to what is shown. I'll fix this eventually if no one else gets to it first.
Derrick Coetzee 00:57, 8 Mar 2004 (UTC)
- I should say that causes potential problem, especially when visiting the previous node. i will try to fix it. --Yacht 01:31, Mar 8, 2004 (UTC)
- This is good, although in my opinion the code could be made simpler, eliminating some of the system calls that might appear strange to a C programmer. The pointer manipulations also appear to come out of nowhere for someone inexperienced with linked lists. I think I'll insert pseudocode for basic manipulations of a few types of linked lists.
- Derrick Coetzee 01:31, 11 Mar 2004 (UTC)
- "The pointer manipulations also appear to come out of nowhere for someone inexperienced with linked lists", i don't know. but i think it would be better to initialize a pointer to NULL. security reason... pseudocode is good, but need to make note that it's unexecutable. --Yacht 04:24, Mar 11, 2004 (UTC)
- I think it's good to be unambiguous in pseudocode; I've seen several use the := notation. An alternative is <-, but that's less familiar.
- Derrick Coetzee 20:41, 11 Mar 2004 (UTC)
The singly-linked list figure is somewhat misleading, since it suggests that the last element contains the address of a special node (rather than containing a special address that points to no node).Jorge Stolfi 20:08, 23 Mar 2004 (UTC)
- The difference is really an implementation detail; it could be implemented using a sentinel node, instead. I chose this notation because I've seen it in books, and it makes it clear that the field still is a pointer, it just doesn't point anywhere interesting.
- I would like to modify the doubly-linked list diagram, though, so that the pointers don't appear to be pointing to cells inside the nodes, but to the nodes as a whole, as far as this is possible. If you imagine the next and prev pointers are literally pointing to each other, you can imagine the confusion.
- Derrick Coetzee 13:58, 24 Mar 2004 (UTC)
I think there's nothing about the main algorithms description that prevents it from being generic, since the data can in fact be a pointer or, as pointed out, be determined by template parameters or parametric polymorphism. The code in this part is also entirely in C, which is inconsistent. I'm going to erase most of it, since it's mostly redundant - I hope there are no objections. Derrick Coetzee 13:34, 27 Aug 2004 (UTC)
- The explanations here are also extremely C-specific ("a car is a void pointer"), and in many cases explains how to do things that are a bad idea, such as indexes by the first letter. If you start doing this sort of thing you might as well use a trie. Also, even in C, it's possible to use macros to allow a generic linked list data structure with internal storage; I know, I've done it. The advertising links to one particular author's linked list implementation are also offensive to me.
- I will incorporate a discussion of internal versus external storage, even if I'm repeating content from reference, and I'll talk about putting single data items in multiple lists, but these are the only things I'll be taking away from this added content.
- On another note, I think the new C linked list example may be longer than necessary, but it isn't bad or out of context, so I'm leaving it alone. Derrick Coetzee 13:54, 27 Aug 2004 (UTC)
I don't want to sound defensive, but the reason I added the generic linked list is that I've seen tons of examples of the first kind of linked list, but seldom have seen of the kind I added. Most of thI don't recall any of the text books on data structures that I've seen that method. (If you know of one, let me know ... I would consider it for my course.) I would really like to see the alternative method available someplace. Comments?
I didn't think a link to my generic linked list package would be offensive, nor did I think of it as an "ad". I don't make any money from it, and the package is under a creative commons license, and I encourage my students to use it rather than write their own. I thought that if somebody wanted a ready-made package, I would point them to it. It's not that I think it is a great package, but it works, so I keep using it. If I had to do it again I would do it differently (but that's almost always true ;^)
I'm new to wiki ... what are the guidelines to referencing external sites that contain more info?
Building an index based on the first letter is a good (IMHO) way to partition a linked list so that you can start your search closer to your target. Depending on the size of the list, the index might be for the first few letters. Partial indecies are used lots of places if the entries are in a sorted order, even in non-linked list applications. I recall a program I worked with that used partial indecies to access offset information on waypoints and airways that were stored in various files.
BTW - a skip list originally is built by linking every "n" nodes together, but as soon as the actual list is modified, that relationship is broken. Finding the 'nth' node in a linked list probably shouldn't be done with a skip list. Even if the list is static, you have to watch out for the end of the list, unless the spacing of each skip level is a modulo of the list size.
If I need direct access to a linked list, what I have done sometimes is to ingest information and insert it into a linked list. I then allocate a dynamic array of pointers, and then loop through the list setting the address in the array. I can then use the array to access the 'nth' node. From then. on, the list must be stable (or at least you have to handle any insertions/deletions) If the list is especially large, I will use an array of linked list (nodes) based on a partial index (first 'n' letters), maintain the size of each list, and then sequentially search the list that contains the element (by adding up the sizes of the sublists until I exceed 'n').
wrp103 (Bill Pringle) | Talk]]
- I'm sorry if my changes seemed rude, and your efforst are appreciated, but really, I found your content primarily redundant and extremely specific to the C language. In many languages, such as C++ and ML, a generic linked list data structure can be written in precisely the way the existing description explains. Even in C, it's possible to write a generic linked list data structure in this way, either using internal storage together with macros, or using external storage - the original description did not exclude the data begin a generic reference to data, and so includes your description, although not written in C - unless there's some other difference I hadn't noted. Also, as I pointed out in the new sections, internal storage is not incompatible with placing objects in multiple linked lists either.
- As for your points about indices, they are certainly useful in a variety of contexts, and I my above points about them may have been in error, but they're not intrinsically related to linked lists, and so probably belong in some article about indexes. All data structures can have indexes constructed on them. A link somewhere may be appropriate.
- Your points about skip lists are accurate and appreciated; please edit, but keep it terse, as the main article is skip list.
- As for the link, this is a contentious point. I follow a policy of including only highly authoritative or popular external links, or article references (in their own section). Some are more flexible, but in this case I assure you that many, many people have written packages identical to yours - I have one on my own website under a less restrictive license. Many of them are used a lot more widely. Wikipedia used to contain links to a site called GreatSnakes containing algorithm descriptions and code samples, which I erased also, because it was a brand new site. I think we should, like an algorithms book, provide the simplest possible description, and leave raw implementations to Google and repositories like SourceForge.
- This is just my own view though, and there has been no consensus on this matter. I imagine others might see it as personal advertising for exposure though. Derrick Coetzee 02:13, 28 Aug 2004 (UTC)
- Don't worry, you have to try a lot harder to offend me. ;^)
- I see (and agree with) most of your points. I'm used to doing examples in C because (a) I like C, and (b) where I teach, the assumption is that all the students know C (which is taught in the "intro" course for those without programming backgrounds) and anyone with programming experience has most likely run across something like C, C++, Java, etc.
- I teach a course on data structures and algorithms, and have been very frustrated with what text books are available, that have poor examples and/or even worse code samples. This has been especially true for sections on linked lists, which always use examples of internal storage, and never even mention how to use external storage. If anyone has suggestions, please let me know.
- As for personal promotion, I would like to think that wasn't a motive. (but I would be the last to know! ;^)
- wrp103 (Bill Pringle) | Talk]] 12:53, 28 Aug 2004 (UTC)
| Contents |
Linked list using an array
I am thinking of adding a linked list varaint that uses an array instead of next and prev pointers. It has some minor advantages but is probably not used widely. Since the article is long and is in good shape, I want to know what you think first. (I am suspecting there might be someone who would say such implementation is unpopular thus somehow irrelevant.) -- Taku 13:54, Aug 28, 2004 (UTC)
- Can you describe the variant here first? I'm not familiar with it. Derrick Coetzee 17:37, 28 Aug 2004 (UTC)
- If you are talking about using array indecies instead of pointers, that is the only way to build a linked list when working with a language that doesn't support pointers. The very first linked list I ever wrote was back in the '60s using FORTRAN!
- wrp103 (Bill Pringle) | Talk]] 19:00, 28 Aug 2004 (UTC)
Linked list record
While having a record that contains the head and tail pointers for a linked list is useful, I would claim that this is obvious, and that its discussion is too C-centric and in the wrong section. Also, using a current pointer inside such a record is a poor design, because it makes it dangerous to iterate through the list twice at one time, which can occur even in two seemingly unrelated pieces of code operating on the same list. For now I cut all this. Derrick Coetzee 18:21, 23 Sep 2004 (UTC)
- I didn't notice this when you did it. I don't agree that it is obvious, especially to a beginner. I am not aware of any text books that mention a linked list data structure. Most of the texts that I have used don't even mention external linked lists, but rather include links within some existing data structure.
- Maybe you're right, but if we're going to mention the idea of encapsulating the abstract data structure in this way it should be done higher up where the representations of various types of linked lists and their operations are first introduced.
- About external linked lists: linked lists with external storage are a special case of internal linked list, in that their data is a reference. The only other change this involves is deallocating data when nodes are destroyed in languages without garbage collection. Therefore this material is in a very formal sense redundant. It's useful to mention the tradeoffs, but they're not specific to linked lists (they were already discussed in reference). The widely-held belief that linked lists with internal storage cannot be made generic is a fallacy, even in C, as I explain in the article. I also think the code examples are probably a bit too long and have too many comments (textbooks never comment every single line; comments should serve to summarize and explain at a high level, but this is a matter of opinion.) I leave the section in as a simple example of the general tradeoffs between internal and external storage. Derrick Coetzee 19:16, 6 Oct 2004 (UTC)
- As for the current link, that is the only way I can think of to implement a generic iterator, which I mentioned in the text. (Yes, the client program can store the current link in local memory, but I'm not talking about that.) It also allows you to create the concept of a recordset cursor. If you have two tasks accessing the list, then they could each have their own LinkList data structure, since it simply contains links to the head and tail nodes (in addition to current). This could cause problems if new head or tail items are inserted, but using dummy/sentinel head & tail elements should help that.
- An iterator for an ordinary linked list need only store a reference to the current node. The record describing the linked list itself should contain a head and tail pointer, but not a current pointer, as this implies there is a single current node for the list at any one time. I think somewhere higher up something should be added like this:
- I don't agree, but that is why there are more than two flavors of ice cream. ;^) Consider implementing an iterator for a circular linked list. When to you know you are done? When the current node is the tail node. Multiple tasks modifying the same linked list raises all kinds of concurrency problems, especially if it is a double linked list. With a single linked list, you just change the next pointer of the previous cell after you have set up the new node, and there shouldn't be a problem, assuming the assignment is an atomic operation. However, with a double linked list, you better have some sort of locking mechanism in place. Because of that, I assumed in my LinkList package that if somebody was iterating through the list, it was a single task, so the only thing that changed was the current pointer. Using that approach, the client program only has one thing to keep track of ... the LinkList pointer, which contains head, tail, and current. Again, it is a matter of preference, and hopefully each person implements whatever they are comfortable with. I think my approach is easier for a beginner to use, which is what most of my students are. wrp103 (Bill Pringle) - Talk 20:49, 6 Oct 2004 (UTC)
- Part of the point a circular linked list is that you can start iterating around the complete list from any point — you just need two iterators and the ability to compare them for equality. They have no head or tail. In applications with additional constraints like concurrency you just add information to the iterator, such as a reference to the list's lock. Even in single-threaded applications, iterators are important for reentrancy; the simplest example is building a loop over all pairs of elements from the list. A more subtle and error-prone example is passing a list to a function which iterates over it while iterating over it. Separating the list and its iterators isn't difficult, even for beginners; I'd argue it also helps better understand and isolate the concepts behind an abstract list. Derrick Coetzee 21:05, 6 Oct 2004 (UTC)
- Er, that said, if you want to insert or delete nodes passing just the iterator, then it does require a reference to the list itself. Note that it does not suffice to add head and tail to the iterator's record, because then updates to head and tail through one iterator aren't reflected in other active iterators. Derrick Coetzee 21:08, 6 Oct 2004 (UTC)
- I don't agree, but that is why there are more than two flavors of ice cream. ;^) Consider implementing an iterator for a circular linked list. When to you know you are done? When the current node is the tail node. Multiple tasks modifying the same linked list raises all kinds of concurrency problems, especially if it is a double linked list. With a single linked list, you just change the next pointer of the previous cell after you have set up the new node, and there shouldn't be a problem, assuming the assignment is an atomic operation. However, with a double linked list, you better have some sort of locking mechanism in place. Because of that, I assumed in my LinkList package that if somebody was iterating through the list, it was a single task, so the only thing that changed was the current pointer. Using that approach, the client program only has one thing to keep track of ... the LinkList pointer, which contains head, tail, and current. Again, it is a matter of preference, and hopefully each person implements whatever they are comfortable with. I think my approach is easier for a beginner to use, which is what most of my students are. wrp103 (Bill Pringle) - Talk 20:49, 6 Oct 2004 (UTC)
- An iterator for an ordinary linked list need only store a reference to the current node. The record describing the linked list itself should contain a head and tail pointer, but not a current pointer, as this implies there is a single current node for the list at any one time. I think somewhere higher up something should be added like this:
- Linked lists implement the abstract data stucture of a list with sequential access. It's useful to hide the implementation details of the linked list behind two records that represent abstract concepts applying to all data structures implementing a sequential access list, namely the list itself, and an iterator, which indicates a position in the list. These are particularly simple for linked lists:
record list {
node head // Reference to first node in list
node tail // For doubly-linked lists, last node in list
}
record iterator {
node current // Reference to node at current position
}
- For other sequential list data structures, these records might contain different data.
- If you know of any texts that do mention linked list structures (and/or discuss internal/external design) I would like to know about them. While I'm at it, can you recommend any text books for Programming Languages Concepts (http://www.personal.psu.edu/faculty/w/r/wrp103/class/cse428/index.html)? I'm not happy with my current one, and wouldn't mind finding a better text. wrp103 (Bill Pringle) - Talk 18:44, 6 Oct 2004 (UTC)
- A pretty thorough discussion of linked lists is given in CLRS (Introduction to Algorithms, ISBN 0070131511), pages 204-208, which is a very good and very popular textbook in general. As for programming language concepts, in my opinion there's no substitute for actually using languages in several paradigms to accomplish nontrivial tasks — but the web page for a course at my university (http://www.cc.gatech.edu/classes/AY2004/cs6390_fall/) offers a number of good references. Derrick Coetzee 19:16, 6 Oct 2004 (UTC)
main function in C
There is no problem with declaring main to take no arguments: it is legal according to both C89 and C99, and is also the usage in the C article itself. The argv and argc arguments would be unused in the function if they were added, so I think that keeping them in the example doesn't add anything. Since it's not illegal to declare main as taking no arguments, I don't see any reason why we shouldn't do it. Neilc 01:17, 29 Nov 2004 (UTC)
- I guess then it's ok. Either way, main (void) or main (int argc, char *argv[]), it's illegal in ancient C anyhow. -- Taku 04:48, Nov 30, 2004 (UTC)
Referencing linked lists
I don't think the problem of how to pass a linked list to a procedure is limited to C (though I could be wrong) -- it seems to me that any language which does not treat the list as a built-in datatype is going to have lists referenced by their head and/or tail elements, and will sometimes run into the tricky problem of having to double-dereference those elements if the operation could change the identity of the head/tail elements. Since this is a problem common to more node-based data structures than just the linked list, however, perhaps it would be best to address it in data structure and mention it briefly here? -- Antaeus Feldspar 16:59, 26 Dec 2004 (UTC)
- Well, the issue is not limited to C, per se, but only in C does it seem that programmers are constantly reinventing the wheel when it comes to datatypes instead of creating abstractions (this issue would never arise in a C++ program using the STL, for example). A better way to do it is to embed the head and/or tail into an opaque data structure representing the list itself and pass a reference to this around. If it must go somewhere though, I would choose reference (computer science); data structure is far too general. Deco 09:41, 28 Dec 2004 (UTC)
