- Lori MacVittie, senior technical marketing manager at F5 Networks (www.f5.com), says:
While cloud computing has brought to the forefront of our attention the ability to scale through duplication, i.e. horizontal scaling or “scale out” strategies, this strategy tends to run into challenges the deeper into the application architecture you go. Working well at the web and application tiers, a duplicative strategy tends to fall on its face when applied to the database tier.
Concerns over consistency abound, with many simply choosing to throw out the concept of consistency and adopting instead an “eventually consistent” stance in which it is assumed that data in a distributed database system will eventually become consistent and cause minimal disruption to application and business processes.
Some argue that eventual consistency is not “good enough” and raise the further concern that such strategies do not adequately handle failures. Thus a number of vendors, open source groups, and pundits spend time attempting to address both consistency and failure handling. The result is database load balancing solutions.
For the most part such solutions are effective. They leverage master-slave deployments – typically used to address failure and which can automatically replicate data between instances (with varying levels of success when distributed across the Internet) – and attempt to intelligently distribute SQL-bound queries across two or more database systems. The most successful of these architectures is the read-write separation strategy, in which all SQL transactions deemed “read-only” are routed to one database while all “write” focused transactions are distributed to another. Such foundational separation allows for higher-layer architectures to be implemented, such as geographic based read distribution, in which read-only transactions are further distributed by geographically dispersed database instances, all of which act ultimately as “slaves” to the single, master database which processes all write-focused transactions. This results in an eventually consistent architecture, but one which manages to mitigate the disruptive aspects of eventually consistent architectures by ensuring the most important transactions – write operations – are, in fact, consistent.
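The read-write separation described above can be sketched in a few lines. This is only an illustration, assuming hypothetical backend names (db-master, db-replica-1, db-replica-2) and a deliberately naive text match rather than a real SQL parser:

```python
import itertools
import re

# Hypothetical backend names, used only for illustration.
MASTER = "db-master"
REPLICAS = ["db-replica-1", "db-replica-2"]

_replica_cycle = itertools.cycle(REPLICAS)

# Statements that start with SELECT are candidates for a replica.
_STARTS_WITH_SELECT = re.compile(r"^\s*select\b", re.IGNORECASE)
# SELECT ... FOR UPDATE takes locks and SELECT ... INTO can create a table,
# so both are treated as writes.
_WRITE_HINT = re.compile(r"\b(for\s+update|into)\b", re.IGNORECASE)


def route(sql: str) -> str:
    """Return the name of the backend that should run this statement."""
    if _STARTS_WITH_SELECT.match(sql) and not _WRITE_HINT.search(sql):
        return next(_replica_cycle)  # round-robin across read replicas
    return MASTER  # writes, and anything unrecognized, go to the master
```

Anything the matcher cannot positively identify as a read, including a stored procedure call such as CALL monthly_report(), falls through to the master. That conservative default keeps writes consistent, but it also means opaque statements gain nothing from the replicas.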
Even so, there are issues, particularly with respect to security.
MEDIATION inside the APPLICATION TIERS
Generally speaking mediating solutions are a good thing – when they’re external to the application infrastructure itself, i.e. the traditional three tiers of an application. The problem with mediation inside the application tiers, particularly at the data layer, is the same for infrastructure as it is for software solutions: credential management.
See, databases maintain their own set of users, roles, and permissions. Even as applications have been able to move toward a more shared set of identity stores, databases have not. This is in part due to the nature of data security and the need for granular permission structures down to the cell, in some cases, and including transactional security that allows some to update, delete, or insert while others may be granted a different subset of permissions. But more difficult to overcome is the tight-coupling of identity to connection for databases. With web protocols like HTTP, identity is carried along at the protocol level. This means it can be transient across connections because it is often stuffed into an HTTP header via a cookie or stored server-side in a session – again, not tied to connection but to identifying information.
At the database layer, identity is tightly-coupled to the connection. The connection itself carries along the credentials with which it was opened.
This gives rise to problems for mediating solutions. Not just load balancers but software solutions such as ESB (enterprise service bus) and EII (enterprise information integration) styled solutions. Any device or software which attempts to aggregate database access for any purpose eventually runs into the same problem: credential management.
This is particularly challenging for load balancing when applied to databases.
LOAD BALANCING SQL
To understand the challenges with load balancing SQL you need to remember that there are essentially two models of load balancing: transport and application layer. At the transport layer, i.e. TCP, connections are only temporarily managed by the load balancing device. The initial connection is “caught” by the Load balancer and a decision is made based on transport layer variables where it should be directed. Thereafter, for the most part, there is no interaction at the load balancer with the connection, other than to forward it on to the previously selected node. At the application layer the load balancing device terminates the connection and interacts with every exchange. This affords the load balancing device the opportunity to inspect the actual data or application layer protocol metadata in order to determine where the request should be sent.
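The difference between the two models can be sketched as follows. The node names and both session classes are hypothetical, and the hash on source address and port is just one possible transport-layer decision:

```python
import zlib

BACKENDS = ["db-1", "db-2"]  # hypothetical node names


def transport_pick(src_ip: str, src_port: int) -> str:
    """Choose a node using only transport-layer values."""
    key = f"{src_ip}:{src_port}".encode()
    return BACKENDS[zlib.crc32(key) % len(BACKENDS)]


class TransportSession:
    """The decision is made once, when the connection is caught.
    Payloads are forwarded without being inspected."""

    def __init__(self, src_ip: str, src_port: int):
        self.node = transport_pick(src_ip, src_port)

    def forward(self, payload: str) -> str:
        return self.node


class ApplicationSession:
    """The connection is terminated at the balancer, so every
    exchange can be inspected before a node is chosen."""

    def __init__(self, choose):
        self.choose = choose  # function mapping a payload to a node

    def forward(self, payload: str) -> str:
        return self.choose(payload)
```

A TransportSession sends every statement on a connection to the same node, while an ApplicationSession can send two statements from the same client to different nodes.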
Load balancing SQL at the transport layer is less problematic than at the application layer, yet it is at the application layer that the most value is derived from database load balancing implementations. That’s because it is at the application layer where distribution based on “read” or “write” operations can be made. But accomplishing this requires that the SQL be inline, that is, that the SQL being executed is actually included in the code and then executed via a connection to the database. If your application uses stored procedures, then this method will not work for you, because the load balancer sees only the procedure call and not the statements the procedure runs. It is important to note that many packaged enterprise applications rely upon stored procedures, and are thus not able to leverage load balancing as a scaling option. Which of these methods is used to access your databases depends on your application and on how your organization has agreed to protect its data. The use of inline SQL affords the developer greater freedom at the cost of security, increased programming (to prevent the inherent security risks), difficulty in optimizing data and indices to adapt to changes in volume of data, and deployment burdens. However, there is lively debate on the value of both access methods and how to overcome the inherent risks. OWASP has identified injection attacks as easy to exploit and severe in impact.
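The injection risk that comes with inline SQL, and the extra programming needed to prevent it, can be shown with Python’s built-in sqlite3 module. The table and the two helper functions are made up for this example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "admin"), ("bob", "user")],
)


def find_unsafe(name: str):
    # Vulnerable: user input is spliced directly into the SQL text.
    return conn.execute(
        f"SELECT name FROM users WHERE name = '{name}'"
    ).fetchall()


def find_safe(name: str):
    # Input is bound as a parameter and is never parsed as SQL.
    return conn.execute(
        "SELECT name FROM users WHERE name = ?", (name,)
    ).fetchall()


attack = "x' OR '1'='1"
```

Calling find_unsafe(attack) returns every row in the table because the input rewrites the WHERE clause. find_safe(attack) returns nothing, since no user has that literal name.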
This also requires that the load balancing service parse the SQL dialect in use, for example the dialect used by MySQL or T-SQL (Microsoft’s Transact-SQL). Databases, of course, are designed to parse these string-based commands and are optimized to do so. Load balancing services are generally not designed to parse these languages and, depending on the implementation of their underlying parsing capabilities, may actually incur significant performance penalties in doing so.
Regardless of those issues, an increasing number of organizations view SQL load balancing as a means to achieve a more scalable data tier. Which brings us back to the challenge of managing credentials.
MANAGING CREDENTIALS
Many solutions attempt to address the issue of credential management by simply duplicating credentials locally; that is, they create a local identity store that can be used to authenticate requests against the database. Ostensibly the credentials match those in the database (or identity store used by the database such as can be configured for MSSQL) and are kept in sync. This obviously poses an operational challenge similar to that of any distributed system: synchronization and replication. Such processes are not easily (if at all) automated, and rarely is the same level of security and permissions available on the local identity store as are available in the database. What you generally end up with is a very loose “allow/deny” set of permissions on the load balancing device that actually open the door for exploitation as well as caching of credentials that can lead to unauthorized access to the data source.
This also leads to potential security risks from attempting to apply some of the same optimization techniques to SQL connections as is offered by application delivery solutions for TCP connections. For example, TCP multiplexing (sharing connections) is a common means of reusing web and application server connections to reduce latency (by eliminating the overhead associated with opening and closing TCP connections). Similar techniques at the database layer have been used by application servers for many years; connection pooling is not uncommon and is essentially duplicated at the application delivery tier through features like SQL multiplexing. Both connection pooling and SQL multiplexing incur security risks, as shared connections require shared credentials. So either every access to the database uses the same credentials (a significant negative when considering the loss of an audit trail) or we return to managing duplicate sets of credentials – one set at the application delivery tier and another at the database, which as noted earlier incurs additional management and security risks.
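The audit-trail tradeoff can be made concrete with a small simulation. FakeDatabase, SharedPool, and PerUserPool are hypothetical stand-ins, not the API of any real driver or application delivery product:

```python
class FakeDatabase:
    """Stand-in for a database that logs the identity bound to each connection."""

    def __init__(self):
        self.audit_log = []

    def connect(self, user: str) -> "FakeConnection":
        return FakeConnection(self, user)


class FakeConnection:
    def __init__(self, db: FakeDatabase, user: str):
        self.db, self.user = db, user

    def execute(self, sql: str) -> None:
        # The database records the connection's user, not the end user.
        self.db.audit_log.append((self.user, sql))


class SharedPool:
    """One shared connection opened with a single service account."""

    def __init__(self, db: FakeDatabase, service_user: str):
        self._conn = db.connect(service_user)

    def execute(self, end_user: str, sql: str) -> None:
        # end_user is known to the application but never reaches the database.
        self._conn.execute(sql)


class PerUserPool:
    """Connections are kept per credential, so identity is preserved."""

    def __init__(self, db: FakeDatabase):
        self._db = db
        self._conns = {}

    def execute(self, end_user: str, sql: str) -> None:
        if end_user not in self._conns:
            self._conns[end_user] = self._db.connect(end_user)
        self._conns[end_user].execute(sql)
```

With SharedPool every audit entry names the service account. With PerUserPool the entries name the real users, at the price of holding and managing one set of credentials per user at the pooling tier.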
YOU CAN’T WIN FOR LOSING
Ultimately the decision to load balance SQL must be driven by a combination of business and operational requirements. Many organizations successfully leverage load balancing of SQL as a means to achieve very high scale. Generally speaking the resulting solutions – such as those often touted by eBay – are based on sound architectural principles such as sharding and are designed as a strategic solution, not a tactical response to operational failures. They rarely involve inspection of inline SQL commands. Rather, they are based on the ability to discern which database should be accessed given the function being invoked or type of data being accessed and then use a traditional database connection to connect to the appropriate database. This does not preclude the use of application delivery solutions as part of such an architecture, but rather indicates a need to collaborate across the various application delivery and infrastructure tiers to determine a strategy most likely to maintain high availability, scalability, and security across the entire architecture.
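Choosing a database by function or by data, rather than by inspecting SQL, might look like the sketch below. The database names, the function map, and the modulo-based shard choice are all assumptions made for illustration:

```python
from typing import Optional

# Hypothetical database names.
ORDER_SHARDS = ["orders-db-0", "orders-db-1", "orders-db-2"]
FUNCTION_MAP = {"catalog": "catalog-db", "billing": "billing-db"}


def shard_for(customer_id: int) -> str:
    """Pick an order shard from the customer id (simple modulo sharding)."""
    return ORDER_SHARDS[customer_id % len(ORDER_SHARDS)]


def database_for(function: str, key: Optional[int] = None) -> str:
    """Pick a database from the function being invoked and, for
    sharded data, from the key of the record being accessed."""
    if function == "orders":
        if key is None:
            raise ValueError("orders access requires a customer id")
        return shard_for(key)
    return FUNCTION_MAP[function]
```

The application calls database_for before opening an ordinary connection, so no intermediary ever needs to parse the SQL that follows.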
Load balancing SQL can be an effective means of addressing database scalability, but it should be approached with an eye toward its potential impact on security and operational management.
Monday, April 9, 2012
Tuesday, October 4, 2011
Application Generators: A New Approach To Build Robust Database Applications Without Programming

- Brendon Ritz and Lauren Barraco, marketing and public relations for Iron Speed (http://www.ironspeed.com/), say:
Today’s business environment demands that managers do more with less, and increasingly, IT managers, CIOs and software developers are turning to a new approach to build robust database applications without programming: application generators. Application generation allows both developers and non-developers to build database-driven applications quickly and efficiently and then deploy them to the Web, the Cloud, or to Microsoft SharePoint environments.
Similar to most applications, database applications are typically used for:
- Reporting and tracking – reporting, summarizing and visualizing data.
- Data entry and management – collecting and editing data from users.
- Business process automation – orchestrating data flow between multiple systems.
- Workflow and scheduling – automating step-wise business processes.
Iron Speed Designer also includes a wide variety of professionally designed page styles that create visually appealing Web applications without a graphic designer. You can customize these styles, or create your own to maintain your corporate look and feel. Many of the options available in Iron Speed and similar Web application builders would take weeks or even months to build by hand. Additionally, features such as reporting, data import and export, and application security are not commonly included in custom applications.
While traditional software developers may disregard application generators in preference to their own hand-written code, the time and cost savings of these tools are too significant to ignore. Even non-developers can build applications in just a few hours or days, rather than weeks or months, all without programming. Some of the key benefits of application generation include:
- Reduced software development costs. Applications can be developed and deployed faster and more efficiently with less cost in human resources. Proof-of-concept systems can be rapidly built, feedback gathered, and quickly modified.
- Consistent look and feel. Generated applications have highly consistent and professionally designed user interfaces, giving applications a finished look and feel, even at the prototype stage. A consistent look and feel across applications reduces end-user learning curves when assimilating new applications.
- Simplified application maintenance. Generated applications follow a highly consistent architecture, allowing any developer to maintain any application. There is little or no ‘ramp up’ time necessary for one developer to maintain another developer’s application because the architectural knowledge transfers from one application to another.
That said, application generators, including Iron Speed Designer, can be difficult for the uninitiated. Our extensive help pages and multi-level training courses are designed to provide assistance for any degree of experience. Most other solutions have “a la carte” pricing, which in most cases forces the purchaser to pay for unneeded features. Before you jump knee-deep into your application solution, try out the Iron Speed Designer trial edition, which includes all the features of the tool with a test database.
If you would like information on one of the most current application generation tools, you may want to visit http://www.ironspeed.com/, or other similar Web sites, that can further explain how these tools can help you increase your credibility and performance as a professional application developer.
Labels:
Application Performance,
Database
Sunday, February 27, 2011
What Do Database Connectivity Standards and the Pirate’s Code Have in Common?
An almost irrefutable fact of application design today is the need for a database, or at a minimum a data store – i.e. a place to store the data generated and manipulated by the application. A second reality is that despite the existence of database access “standards”, no two database solutions support exactly the same syntax and protocols.
Connectivity standards like JDBC and ODBC exist, yes, but like SQL they are variable, resulting in implementations that are just different enough to effectively cause vendor lock-in at the database layer. You simply can’t take an application developed to use an Oracle database and point it at a Microsoft or IBM database and expect it to work. Life’s like that in the development world. Database connectivity “standards” are a lot like the pirate’s Code, described well by Captain Barbossa in Pirates of the Caribbean as “more what you’d call ‘guidelines’ than actual rules.”
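One well-known example of this variability is limiting a result set to the first few rows, which each product spells differently. The helper below is a hypothetical sketch: the function name and dialect labels are invented, and it assumes a trusted, hard-coded table name:

```python
def first_rows_query(table: str, n: int, dialect: str) -> str:
    """Build a 'first n rows' query for a trusted, hard-coded table name."""
    if dialect in ("mysql", "postgresql"):
        return f"SELECT * FROM {table} LIMIT {n}"
    if dialect == "sqlserver":
        return f"SELECT TOP {n} * FROM {table}"
    if dialect == "oracle":
        # Classic Oracle form; newer releases also accept FETCH FIRST.
        return f"SELECT * FROM {table} WHERE ROWNUM <= {n}"
    raise ValueError(f"unsupported dialect: {dialect}")
```

The same intent produces three different statements, which is exactly the kind of difference that keeps an application tied to the database it was written against.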
It shouldn’t be a surprise, then, to see the rise of solutions that address this problem, especially in light of an increasing awareness of (in)compatibility at the database layer and its impact on interoperability, particularly as it relates to cloud computing. Forrester analyst Noel Yuhanna recently penned a report on what are being called Database Compatibility Layers (DCL). The focus of DCL at the moment is on migration across database platforms because, as pointed out by Noel, such migrations are time consuming and very costly.
According to Forrester a Database Compatibility Layer (DCL) is a “database layer that supports another DBMS’s proprietary SQL extensions, data types, and data structures natively. Existing applications can transparently access the newly migrated database with zero or minimal changes.” By extension, this should also mean that an application could easily access one database and a completely different one using the same code base (assuming zero changes, of course). For the sake of discussion let’s assume that a DCL exists that exhibits just that characteristic – complete interoperability at the connectivity layer. Not just for migration, which is of course the desired use, but for day to day use. What would that mean for cloud computing providers – both internal and external?
ENABLING IT as a SERVICE
Based on our assumption that a DCL exists and is implemented by multiple database solution vendors, a veritable cornucopia of options becomes a lot more available for moving enterprise architectures toward IT as a Service than might be at first obvious.
Consider that applications have variable needs in terms of performance, redundancy, disaster recovery, and scalability. Some applications require higher performance, others just need a nightly or even weekly backup and some, well, some are just not that important that you can’t use other IT operations backups to restore if something goes wrong. In some cases the applications might have varying needs based on the business unit deploying them. The same application used by finance, for example, might have different requirements than the same one used by developers. How could that be? Because the developers may only be using that application for integration or testing while finance is using it for realz. It happens.
What’s more interesting, however, is how a DCL could enable a more flexible service-oriented style buffet of database choices, especially if the organization used different database solutions to support different transactional, availability, and performance goals.
If a universal DCL (or near universal at least) existed, business stakeholders – together with their IT counterparts – could pick and choose the database “service” they wished to employ based on not only the technical characteristics and operational support but also the costs and business requirements. It would also allow them to “migrate” over time as applications became more critical, without requiring a massive investment in upgrading or modifying the application to support a different back-end database.
Obviously I’m picking just a few examples that may or may not be applicable to every organization. The bigger thing here, I think, is the flexibility in architecture and design that is afforded by such a model that balances costs with operational characteristics. Monitoring of database resource availability, too, could be greatly simplified from such a layer, providing solutions that are natively supported by upstream devices responsible for availability at the application layer, which ultimately depends on the database but is often an ignored component because of the complexity currently inherent in supporting such a varied set of connectivity standards.
It should also be obvious that this model would work for a PaaS-style provider who is not tied to any given database technology. A PaaS-style vendor today must either invest effort in developing and maintaining a services layer for database connectivity or restrict customers to a single database service. The latter is fine if you’re creating a single-stack environment such as Microsoft Azure but not so fine if you’re trying to build a more flexible set of offerings to attract a wider customer base.
Again, same note as above. Providers would have a much more flexible set of options if they could rely upon what is effectively a single database interface regardless of the specific database implementation. More importantly for providers, perhaps, is the migration capability noted by the Forrester report in the first place, as one of the inhibitors of moving existing applications to a cloud computing provider is support for the same database across both enterprise and cloud computing environments.
While services layers are certainly a means to the same end, such layers are not universally supported. There’s no “standard” for them, not even a set of best practice guidelines, and the resulting application code suffers exactly the same issues as with the use of proprietary database connectivity: lock in. You can’t pick one up and move it to the cloud, or another database without changing some code. Granted, a services layer is more efficient in this sense as it serves as an architectural strategic point of control at which connectivity is aggregated and thus database implementation and specifics are abstracted from the application. That means the database can be changed without impacting end-user applications, only the services layer need be modified.
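A services layer of the kind described here can be sketched as an interface that the application depends on, with the database hidden behind it. All class and function names below are invented for this example, and the two back ends (SQLite and a plain dictionary) are stand-ins for two different database products:

```python
import sqlite3
from typing import Optional, Protocol


class CustomerStore(Protocol):
    """The only thing application code is allowed to depend on."""

    def get_name(self, customer_id: int) -> Optional[str]: ...


class SqliteCustomerStore:
    def __init__(self):
        self._conn = sqlite3.connect(":memory:")
        self._conn.execute(
            "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)"
        )
        self._conn.execute("INSERT INTO customers VALUES (1, 'Acme')")

    def get_name(self, customer_id: int) -> Optional[str]:
        row = self._conn.execute(
            "SELECT name FROM customers WHERE id = ?", (customer_id,)
        ).fetchone()
        return row[0] if row else None


class DictCustomerStore:
    """Stand-in for a completely different back end."""

    def __init__(self):
        self._rows = {1: "Acme"}

    def get_name(self, customer_id: int) -> Optional[str]:
        return self._rows.get(customer_id)


def greeting(store: CustomerStore, customer_id: int) -> str:
    # Application code: no SQL and no driver-specific calls.
    name = store.get_name(customer_id)
    return f"Hello, {name}" if name else "Unknown customer"
```

Swapping the back end means writing a new store class and leaving greeting untouched, which is the sense in which only the services layer needs to be modified.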
But even that approach is problematic for packaged applications that rely upon database connectivity directly and do not support such service layers. A DCL, ostensibly, would support packaged and custom applications if it were implemented properly in all commercial database offerings.
CONNECTIVITY CARTEL
And therein lies the problem – if it were implemented properly in all commercial database offerings. There is a risk here of a connectivity cartel arising, where database vendors form alliances with other database vendors to support a DCL while “locking out” vendors whom they have decided do not belong.
Because the DCL depends on supporting “proprietary SQL extensions, data types, and data structures natively” there may be a need for database vendors to collaborate as a means to properly support those proprietary features. If collaboration is required, it is possible to deny that collaboration as a means to control who plays in the market. It’s also possible for a vendor to slightly change some proprietary feature in order to “break” the others’ support. And of course the sheer volume of work necessary for a database vendor to support all other database vendors could overwhelm smaller database vendors, leaving them with no real way to support everyone else.
The idea of a DCL is an interesting one, and it has its appeal as a means to forward compatibility for migration – both temporary and permanent. Will it gain in popularity? For the latter, perhaps, but for the former? Less likely. The inherent difficulties and scope of supporting such a wide variety of databases natively will certainly inhibit any such efforts. Solutions such as a RESTful interface, a la PHP REST SQL, or a JSON-over-HTTP solution like DBSlayer may be more appropriate in the long run if they were to be standardized.
And by standardized I mean standardized with industry-wide and agreed upon specifications. Not more of the “more what you’d call ‘guidelines’ than actual rules” that we already have.
**F5 is a regular contributor on Data Center POST
Wednesday, February 23, 2011
Database Security: A Breakthrough in Database Patching
- Dan Sarel, Vice President of Products at Sentrigo (www.sentrigo.com), says:
Hedgehog vPatch is useful for data centers
It boils down to providing protection for databases during a crucial period of time. Patches are regularly issued by database vendors to address known vulnerabilities in their DBMS software. But for a variety of reasons, enterprises are not always able to install those patches in a timely manner; often, they are not installed at all. Yet, once the patch is released, hackers know about the weakness, and can exploit systems that are not yet patched, gaining access to sensitive records. That’s where vPatch comes in. It gives organizations a reliable way to protect their databases and bridge the security gap that exists between the issuance of vendor patch updates and the actual installation of those patches.
The Hedgehog vPatch
Hedgehog vPatch is based on database patching that Sentrigo pioneered in 2008 when it unveiled virtual patching technology. It combines a small non-intrusive sensor on each database server with a set of frequently updated rules to detect in memory any attempts to exploit known vulnerabilities as well as common hacking techniques. The system can be configured to respond in a variety of ways: issuing a real-time alert, terminating the session, placing the user in quarantine and updating the enterprise firewall to block access from the source IP address. Sentrigo updates the virtual patching rules when we discover new vulnerabilities, when new vulnerabilities are made public, and when each new vendor patch is released, to protect customer systems from the latest exploits.
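The general idea of matching database activity against a rule set and choosing a response can be sketched as follows. This is not Sentrigo’s implementation: the rule names, patterns, and the evaluate function are invented, and a real sensor would inspect activity in memory rather than match text with regular expressions:

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Rule:
    name: str
    pattern: "re.Pattern[str]"
    action: str  # "alert", "terminate", or "quarantine"


# Hypothetical rules; the package name below does not refer to a real product.
RULES = [
    Rule("call-to-vulnerable-package",
         re.compile(r"\bvulnerable_pkg\.run\b", re.IGNORECASE),
         "terminate"),
    Rule("stacked-drop-table",
         re.compile(r";\s*drop\s+table\b", re.IGNORECASE),
         "quarantine"),
]


def evaluate(statement: str) -> Optional[Tuple[str, str]]:
    """Return (rule name, action) for the first matching rule, else None."""
    for rule in RULES:
        if rule.pattern.search(statement):
            return (rule.name, rule.action)
    return None
```

Because the rules live outside the DBMS, updating them does not touch the database software itself.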
Benefit for data center/IT managers
In addition to protecting databases during the critical period in between the issuance of vendor patches and the actual installation of those patches, Hedgehog vPatch solves two of the major problems that delay and often prevent the installation of vendor patch updates. Because the Hedgehog sensor is read-only and installed as a user process, it doesn’t make any changes to the DBMS software itself. Therefore, it does not require any database downtime, and does not require the same level of application testing that a physical patch requires – major reasons many organizations delay patching.
An additional benefit of virtual patching is that the system can protect older versions of databases that are still in use in the organization, yet are no longer supported by vendor patches. This can be a significant issue, as frequently the vulnerability discovered in the current release of a DBMS is also present in earlier versions, but without a patch the system is at risk.
We’ve seen two primary drivers that lead an organization to deploy Hedgehog vPatch. First, most organizations have a stated patching policy that dictates how soon an update must be applied. The policy is often the result of a law or other regulation, such as PCI-DSS or Sarbanes-Oxley, that mandates timely patching. Often, for the reasons stated earlier, organizations simply cannot meet this policy, and it becomes a compliance issue. We have many customers who use virtual patching as part of their overall patching strategy and satisfy governance standards. The second driver is security – if a breach does occur, it is very likely to be well publicized, triggering the question: “How long will it take my company to recover from damaged reputation, potential fines and the loss of customers that often results from a breach?”
The best advice we can give Data Center POST readers is to apply vendor patches as soon as possible after they are released. But, we know from experience that this is not always possible, and this often becomes an issue during many compliance audits. Hedgehog vPatch is a quick and pain-free way to compensate for not being able to apply patches immediately, and can be used to meet compliance requirements.



