Tom Kyte: Geek of the Week

Tom Kyte's contribution to the AskTom column and site over ten years has been outstanding. Much of what he says has relevance to all relational databases. His views are straightforward, the discussions he provokes are lively: Not only does he know a frightening amount about both Oracle and SQL Server, but he's also refreshing to listen to.

1071-Kyte.JPG Tom Kyte is a Senior Technical Architect in Oracle’s Server Technology Division. Before starting at Oracle, he worked as a systems integrator building large-scale, heterogeneous databases and applications, mostly for military and government customers.

“You have to be constantly
evolving and in some cases
DBAs don’t do that because
they know how they did it
years ago and they want to
keep doing it that way…”

Tom spends a large amount of his time working with the Oracle database and, more specifically, working with people who are working with the Oracle database.

Tom is also the ‘Tom’ behind the AskTom column in Oracle Magazine, answering people’s questions about the Oracle database and its tools (http://asktom.oracle.com/) and is also the author of Expert Oracle Database Architecture (Apress, 2005), Expert One on One Oracle (Wrox Press, 2001/Apress 2004), Beginning Oracle Programing (Wrox press, 2002/Apress 2004), and Effective Oracle by Design (Oracle Press, 2003).

These are books about the general use of the database and how to develop successful Oracle applications.  He lives in Round Hill, Virginia.

RM:
Tom, AskTom was 10 years old earlier this month, why do you still do it, what keeps it interesting for you and did you have a plan when you started to be answering readers queries a decade later?
TK:
Did I have a plan? No, not really. AskTom happened entirely by accident.  I used to answer questions on the Internet Usenet newsgroups and had been doing that for about six years since 1994. I answered a question on the newsgroups and someone followed up my answer with a suggestion to update the SYS tables (the true dictionary).

I pointed out that doing that would be a big “no no”, something to not even consider and asked them where they had heard to do that.  It turned out that they’d it from Oracle Magazine via a user supplied tip! I got in touch with Oracle Magazine and as it happened they didn’t have a technical editor for the user tips so I volunteered to do it and that started the column in the magazine. The AskTom website started about five months later.

What do I get out of it? Well, I like seeing people come and go over time, some of the people who used to ask me questions ten years ago are now the leaders in the Oracle community. So it’s fulfilling to see people come, learn, see how they do things correctly and then go out and replay that knowledge themselves.

RM:
Moving on to where we are now, both in terms of advances in hardware as well as the Oracle RDBMS, what has this meant for the type of problems you encounter on the site, that people bring to you?
TK:
None whatsoever, it just means people encounter the problems even faster.  So they can compile and develop faster. The human brain isn’t any faster today than it was 10 years ago; we still need the same amount of time to design. The basic problem is people don’t fully understand how things work and then make bad decisions based on how they think it should work.
RM:
What are the biggest lessons you feel you’ve learnt from the Oracle community in 10 years?
TK:
That I don’t know everything. Each day I learn something new about Oracle from the questions I get, or seeing somebody else follow up with a ‘but why didn’t you tell them this?’ and my answer would be because I had no idea that existed yet.
RM:
Are there issues that crop up again and again?
TK:
Perhaps the biggest issue is the black box approach of development. A developer will learn everything they can about the procedural language they’re using. However, they don’t learn about the database that they’re using or other packages that might be involved. My premise is they’re not building an application so much as something that manipulates data and the data is going to outlive the application. The data might already exist; it might even pre-date the application.  The data is important, so foregoing knowledge of what I view as the key, crucial, critical component of the application – the database storing the data, is short-sighted and the number one cause of performance issues. The typical mindset is we’ll just get a framework and say ‘oh the framework will hide all that nasty SQL, I don’t have to know anything about it, I’ll just push the transactional button and everything will be good.’

The other ones are anti-design patter that people pick up, such as lack of bind variables. You would think that knowledge would have got across, yet developers, for whatever reason are not being taught this stuff.

RM:
Do you think then that poor education is to blame? That somehow it’s got worse over the years rather than getting better?
TK:
No, it hasn’t changed. When I get up on stage at a seminar and I talk about bind variables I start by saying that for 16 years I’ve been talking about the same thing but each year the problem is the same. Why? Because universities are trying to teach students theory and algorithms and things like that, they’re not teaching them how to write production quality code. They don’t teach them how to debug or how to instrument, they don’t teach them how to defensively programme. They just teach them how to write a compiler in Lisp which frankly doesn’t translate very well into IT.
RM:
Has it always been the same?
TK:
It has been that way forever because it’s called computer science. They are turning out computer scientists with lots of theory and little practice.

People don’t take the MIS, IT, data processing courses because they sound very boring alongside computer science. When I went to college I took mathematics, focusing on abstract algebra. I never saw any numbers, it was proofs and theorems, things I don’t use today in real life, but it was teaching a science. It was useful to me, don’t get me wrong, but I didn’t really have an out of the box skill I could use in the workforce. I had to be trained from scratch.

Now computer science students are learning things like how to build a compiler – that doesn’t help anybody in IT because they’re not building compilers. I’m not saying it’s useless, it’s just they’re not being taught what they need to do in a particular job. Then employers hire them assuming that because they’re computer scientists they must be really good programmers but they’ve never been taught to program. That is where the problem begins.  

RM:
Okay, let’s move on to another question concerning database triggers. Is there a need for them? Should they be banned? Do you think triggers should be banned?
TK:
If a well informed person who understands how to use them and understands concurrency, multiple users doing things, uses them, triggers are awesome.  But the vast amount of people using triggers are not that well informed they don’t understand the side effects and they definitely don’t think in terms of multiple users.

When that developer uses a trigger they’re using a very dangerous thing because they don’t realise what they’ve done and how many ways that they can break. I have three features that are so misused that I think the world would be a better if we did not have them and triggers is one of them.

RM:
What are the other two?
TK:
The generic catch all exception – ‘when others’. Many developers believe that if their code raises an error it makes them look bad so they’ll do ‘when others then null’ and that just swallow, hide the exception. Nobody knows there is a failure and that is perhaps the number one cause of developer bugs that I see.  Whenever somebody posts code on AskTom that has ‘when others then null’ or ‘when others then db’ I just cut and paste that part of their code and then I put in a link to a search string on AskTom saying “I hate your code” in quotes to point to all the other examples I’ve written about.

The third issue is the autonomous transaction which allows people to do things such as commit in a trigger because they can do that in SQL Server.

SQL Server triggers work differently than Oracle triggers and, committing in a trigger is a horribly bad idea. I believe it’s horribly bad in SQL Server, but it’s doubly horribly bad in Oracle.

Many people inappropriately use an autonomous transaction all over the place to get around what they think are limitations, but when I look at that code and I explain to them what’s going on they don’t even realise the transactional integrity issues that they’ve introduced into their application.

You know if you commit in a trigger that means you commit that work. But they don’t realize it only commits the work that was done in the trigger, not the statement that caused the trigger to fire. Think about what happens then when the user rolls back? The stuff you did in your trigger doesn’t roll back with you, it’s non-transactional, you’ve already committed it and now the database is sort of left in this inconsistent state.

RM:
I suppose really that comes onto the next question about what kind of mistakes and assumptions do you see people who are new to Oracle making who come from a SQL Server background?
TK:
It’s the thought that a database is a database, that they’re black boxes, they all work the same and they don’t. I can almost always tell when a SQL Server developer is coming to AskTom for the first time based on the questions that they ask. You know, a very popular question I get is “how do I create a table in a store procedure?”  I refuse to answer that question; rather I ask them “why”, “why do you want to do that?”
RM:
Why do you answer in that way?
TK:
Because the answer is always well I’m a SQL Server developer, I always do “SELECT  INTO #tmp_table” and that created a temporary table for me.  The way temporary tables work in Oracle is different to SQL Server. It’ll be like a Windows operating system level programmer, thinking they can write device drivers on Linux because they can write device drivers on Windows without having to learn anything because it’s a device driver – they must all be the same. They’d fail miserably because the Linux architecture is radically different from Windows. So while they might have a lot of knowledge writing device drivers, it doesn’t mean they can write a Linux device driver right away.
RM:
Okay, if we go on to third party tools, are there now more opportunities for third party developers to work with Oracle?  Is this Oracle’s strategy and is it working?
TK:
It’s the third party tools that have worked to make Oracle’s tools better, we found that the Quests and the SQL Navigators had better IDEs and a great development environment than we did and that’s what drove us to produce something like SQL Developer.  If there hadn’t been TOAD, for example, there might not have been a SQL Developer. 

As far as I’m concerned we’ve always promoted third party tools, add-ons, especially things that filled the gap that we did not have, and what you’ll see sometimes is we’ll actually appreciate those tools so much we go out and buy them. For example GoldenGate replication technology that’s now part of the Oracle framework, filled a gap which we did not have – heterogeneous replication, replication from Oracle to DB2 and vice versa. So those third party tools are there, they exist, we’ve always promoted them.

RM:
With SQL Server, there comes a time when you think “this would be a really useful, feature, if only they had bothered to finish it”. Text and image data types spring to mind, along with extended properties. Are there any equivalents in Oracle?
TK:
From a database perspective, not really.  The only one I can come up with is sort of peripheral to the database which is I’m a command line person.  SQL Plus is a tool that we have that does command line stuff, but SQL Plus is definitely the focus this year and for it to continue to advance that is something I would really appreciate seeing but it hasn’t changed very much in the last 10 years, incremental things, but nothing very large. 
RM:
Which aspects of working with Oracle get you really excited?
TK:
I like the investigation part, being given a problem and then working out the solution – using the wealth of information we have today makes it much easier than in the past and the fact that a new release comes out approximately every 18 months, that’s really interesting as well because there’s always new capabilities rolling out.  I remember when Version 7.3 came out in the mid 1990’s and I looked at the feature list and I said “what more can you put into a database?”.  And yet a year and a half later they came out with Version 8 and 9 and 10 and 11 and I can’t imagine going back and using Version 7.3 anymore, it would be like stepping back into the Stone Age.  The new capabilities that come out keep it very interesting.
RM:
What’s your take on where Oracle RDBMS is going to go over the next five years and what are the main challenges that are being addressed do you think?
TK:
You’re going to see a lot of things to do with scaling up large, as well as automation.  That was a move that started mostly in 9i and above as far as automating the things DBAs have to do over and over again.  Ten years ago the DBA would have to configure rollback segments and figure out how to size them and how many there should be and 9i comes out and we’ve got it down to basically one parameter for rollback. It’s now called undo and the DBA sets a parameter and we figure out how big they are, how many we need and so on, based on the workload.

The other one is what I call big scaling, the capability to go beyond 100’s of terabytes.  It was only 13 years ago that we put out a press release saying we could successfully load and index and query a terabyte of data. Today, a terabyte of data is nothing, it’s a small database, so we’re looking at tens to hundreds of terabytes being the norm; not the exception.  With the price of disk coming down as it has and the ability to get a terabyte of storage for $100, the whole situation has changed. 

People don’t use delete anymore, they only ever use insert and update and the databases are growing, the capabilities to be able to manage that volume of information, that’s another large focus area.

RM:
What do you think is going to be the most immediate impact on the life of an Oracle DBA as a result of the evolution of the platform?
TK:
That’s the key word, evolution. You have to be constantly evolving and in some cases DBAs don’t do that because they know how they did it years ago and they want to keep doing it that way.  Things have changed.  If a DBA doesn’t keep up with the technology it will pass them by.
RM:
Okay, if you could choose any two features to have from Oracle’s 11g large upgrade list what would you choose and why?
TK:
Well the first one would be the ability to do the real application testing throughout.  With real application testing you have the ability to test in private, without embarrassing yourself, to try things out and measure what it would have done in production if you had made that change.

There is a new Oracle Database 11g Release 2 feature that allows me to upgrade my applications online. I found this to be such an interesting feature that I wrote about this new capability in some length in Oracle Magazine.          

RM:
Okay, it’s probably a very short answer to the next one.  Is there any feature with the PL on SQL or Oracle that makes you want to smash things?
TK:
No, there’s features that I’d like to have added but there’s no construct in PL SQL that makes me want to smash my fingers, I don’t think it’s too verbose, I don’t think it’s too terse.

I was an Ada programmer for quite a while in the 1980s until the 1990s, so the language looks familiar. They took out some of the things I didn’t like in Ada and I think you have a pretty decent language right there. They could do arrays better, I would love to have a true two or three or four dimensional array type. But from a programming construct that would probably be the first thing I’d want added and I’d have to think long and hard what the next bit would be. 

RM:
Do you think too many IT managers see projects purely in terms of the client app and put the whole design in the app developer’s hands? Is this the root cause of poor database design and a lot of the issues that plague applications time and again?
TK:
Yes absolutely.  This is their biggest problem and this is the biggest problem. An application developer who maybe has three years of experience of writing code has had some successes then they’re put in charge of a project and the project goes way beyond their capabilities, they don’t know that yet.  And so it happens throughout history forever, right?  It’s the same old problem but eventually they do fail and 10 years from now they’ll be the ones railing against their younger developers saying no, no, no, that’s not the right way to do it and then they get called old and stuffy.
RM:
Do you think that’s the most dangerous outcome?  That I suppose that perceived knowledge is held on to and they are not able to accept anybody else’s view?
TK:
I think it’s true of every new excited person in any sort of working area. It’s amazing for me to look at how much I didn’t know when I was 28.  But if you had asked me at 28 how much I didn’t know I would have said I think I pretty much know it all, right, so that’s just something that comes with experience.

But there is a lot of danger in these things in terms of purely the client application, because there’s applications come and there’s applications go, applications have a very short lifespan, even shorter today than they used to.  How many applications do you use today that you were using 10 years ago?  And I’m talking IT applications, not Microsoft Word, not a web browser.

In SQL Server applications, chaos indexing seems to be a perennial problem and it seems to be the same in Oracle.  Why is indexing such a hard topic for most developers to tackle?

TK:
Probably because if you ask a developer they would say indexing is not my job, that’s something the DBA will take care of it. 
RM:
Yeah, sure.
TK:
The DBA is ill suited to come up with an indexing scheme because they don’t understand how the application uses the data. So if you ask me, I believe it is the job of the developer to issue the create table, create index and use the storage structures appropriately. However, in real life most people follow the structure that the DBA does all of that, anything that starts with create is a DBA task and the developers use what they create. So there’s this split between these two groups and they don’t always work together as closely as they should.

But that is why you see a lot of tools that will go out there and analyse your SQL on a production system and advise indexes. So we have those and many others have implemented those as an attempt to sort of…as a panacea if you will to try and solve this problem. But they don’t really solve the problem because nobody has sat down and really thought about the indexing strategy and that goes back to the previous bullet point which is you’ve really got to think about the data and how it will be used, how it will be accessed so that you can utilise the optimal structure for that. Make sure that the database organises the data in a way that’s efficient for you to retrieve that data.

RM:
Talking about Oracle a moment, do you think there’s anything they should be doing more to champion the cause of the development DBA?
TK:
I’m not really sure how to answer that because a lot of people haven’t even recognised the role of a development DBA.

They might have in mind production DBAs and developers and in many places the developers are playing the role of the development DBA and the development DBA knows all the capabilities and information that production DBA has, they just need to be a little bit more flexible in their environment really.

The company won’t shut down if something really bad goes wrong in development, so they’re more flexible, able to try things but I don’t see a lot of “true development DBAs”, I see a lot of developers pretending to be one.  If some have the development DBA then we have a lot of what they need, they’ve got the forums, there’s education, Oracle itself does seminars and whatnot, for free, come in, we’ll show you what to do. 

RM:
Who would you say has taught you most about Oracle? Your Zen master, if you like?
TK:
I don’t think there was any one individual.  When I started programming, I had a mentor who taught me how to program, how to instrument code, how to program defensively, how to be able to effectively debug your code even if you couldn’t run the debugger on it, how to test multiple approaches, how to evaluate what was the best way to do something under a given set of circumstances, how to do things slowly.  He was very methodical but he taught me how to program and I’ve carried that throughout my career and have made use of his approaches for the last 23, 24 years. From an Oracle perspective, it’s no one individual, I have learned things from Jonathan Lewis, I have learned things from Cary Millsap, I’ve learned things from every single user who ever comes to AskTom because they ask me a question that I don’t know the answer to yet and they force me to learn something that I didn’t know.